Abstract

In this paper, we determine the complexity of the satisfiability problem 
for various logics obtained by adding numerical quantifiers, and other con- 
structions, to the traditional syllogistic. In addition, we demonstrate the 
incompleteness of some recently proposed proof-systems for these logics. 



1 Introduction 

Inspection of the argument 

At least 13 artists are beekeepers 

At most 3 beekeepers are carpenters

At most 4 dentists are not carpenters (1)

At least 6 artists are not dentists. 

shows it to be valid: any circumstance in which all the premises are true is one 
in which the conclusion is true. Considerably more thought shows the argument 

At most 1 artist admires at most 7 beekeepers 

At most 2 carpenters admire at most 8 dentists 

At most 3 artists admire at least 7 electricians 

At most 4 beekeepers are not electricians (2) 

At most 5 dentists are not electricians 

At most 1 beekeeper is a dentist 

At most 6 artists are carpenters 

to be likewise valid — assuming, that is, that the quantified subjects in these 
sentences scope over their respective objects. This paper investigates the com- 
putational complexity of determining the validity of such arguments. 
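
The validity of argument (1) can be confirmed by a short count. The following derivation is only a sketch: the set names $A$, $B$, $C$, $D$ (artists, beekeepers, carpenters, dentists) and the auxiliary set $S$ are introduced here for illustration.

```latex
\begin{align*}
S &:= (A \cap B) \setminus C, \\
|S| &\geq |A \cap B| - |B \cap C| \geq 13 - 3 = 10, \\
|S \cap D| &\leq |D \setminus C| \leq 4, \\
|A \setminus D| &\geq |S \setminus D| = |S| - |S \cap D| \geq 10 - 4 = 6.
\end{align*}
```

No comparably short count is apparent for argument (2), which is why its validity takes considerably more thought.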

Argument (1) is couched in a fragment of English obtained by extending the
syllogistic (the language of the syllogism) with numerical quantifiers. Adapting
the terminology of de Morgan [1], we call this fragment the numerically definite
syllogistic. When its sentences are expressed, in the obvious way, in first-order
logic with counting quantifiers, the resulting formulas feature only one variable.
Argument (2) is couched in a fragment of English obtained by extending the
numerically definite syllogistic with transitive verbs. We call this fragment the 
numerically definite relational syllogistic. When its sentences are expressed, 
in the obvious way, in first-order logic with counting quantifiers, the resulting 
formulas feature only two variables. 

The satisfiability and finite satisfiability problems for the two-variable
fragment of first-order logic with counting quantifiers are known to be
NEXPTIME-complete. Surprisingly, however, no corresponding results exist in the literature
for the other fragments just mentioned. The main results of this paper are: (i) 
the satisfiability problem (= finite satisfiability problem) for any logic between 
the numerically definite syllogistic and the one-variable fragment of first-order 
logic with counting quantifiers is strongly NP-complete; and (ii) the satisfiabil- 
ity problem and finite satisfiability problem for any logic between the numeri- 
cally definite relational syllogistic and the two-variable fragment of first-order 
logic with counting quantifiers are both NEXPTIME-complete, but perhaps not 
strongly so. We investigate the related problem of probabilistic (propositional) 
satisfiability, and use the results of this investigation to demonstrate the incom- 
pleteness of some proof-systems that have been proposed for the numerically 
definite syllogistic and related fragments. 

2 Preliminaries 

In the sequel, we employ first-order logic extended with the counting quantifiers
$\exists_{\leq C}$, $\exists_{\geq C}$ and $\exists_{=C}$, for any $C \geq 0$, under the obvious semantics. Note that,
in this language, $\exists x \phi$ is logically equivalent to $\exists_{\geq 1} x \phi$, and $\forall x \phi$ is logically
equivalent to $\exists_{\leq 0} x \neg\phi$. The one-variable fragment with counting quantifiers, here
denoted $\mathcal{C}^1$, is the set of function-free first-order formulas featuring at most one
variable, but with counting quantifiers allowed. We assume for simplicity that
all predicates in $\mathcal{C}^1$ have arity at most 1.

We define the fragment $\mathcal{N}^1$ to be the set of $\mathcal{C}^1$-formulas of the forms

$\exists_{\geq C} x(p(x) \wedge q(x))$    $\exists_{\geq C} x(p(x) \wedge \neg q(x))$
$\exists_{\leq C} x(p(x) \wedge q(x))$    $\exists_{\leq C} x(p(x) \wedge \neg q(x))$,    (3)

where $p$ and $q$ are unary predicates. Linguistically, we think of unary predicates
as corresponding to common nouns, and the formulas (3) to English sentences
of the forms

At least C p are q    At least C p are not q
At most C p are q    At most C p are not q,    (4)

respectively. (We have simplified the presentation here by ignoring the issue 
of singular/plural agreement; this has no logical or computational significance, 
and in the sequel, we silently correct any resulting grammatical infelicities.) We 
call the fragment of English defined by these sentence-forms the numerically
definite syllogistic. 

The sentence Some p are q may be equivalently written At least 1 p is a q,
and the sentence All p are q may be equivalently (if somewhat unidiomatically)
written At most 0 p are not q. Thus, the numerically definite syllogistic
generalizes the ordinary syllogistic familiar from logic textbooks. Furthermore, the
sentence There are at least C p may be equivalently written At least C p are p; 
and similarly for There are at most C p. Some authors take the sentences Every 
p is a q and No p is a q to imply that there exists some p. We do not adopt this 
convention. 
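
The semantics of the four sentence-forms can be made concrete by evaluating them in a finite structure. The following is a minimal sketch, not part of the original development: the tuple encoding of sentences and the names `holds`, `some` and `all_` are hypothetical choices made for illustration.

```python
from typing import Dict, Set, Tuple

# A finite structure: each unary predicate (common noun) maps to its extension.
Structure = Dict[str, Set[int]]
# A sentence is (kind, C, p, q, negated), with kind "atleast" or "atmost";
# negated=True stands for the forms "... p are not q".
Sentence = Tuple[str, int, str, str, bool]


def holds(structure: Structure, sentence: Sentence) -> bool:
    """Evaluate a numerically definite syllogistic sentence in a finite structure."""
    kind, bound, p, q, negated = sentence
    p_ext = structure.get(p, set())
    q_ext = structure.get(q, set())
    # Elements that are p and q, or p and not q, as the form requires.
    witnesses = p_ext - q_ext if negated else p_ext & q_ext
    if kind == "atleast":
        return len(witnesses) >= bound
    if kind == "atmost":
        return len(witnesses) <= bound
    raise ValueError(f"unknown quantifier kind: {kind}")


def some(p: str, q: str) -> Sentence:
    """Some p are q, written as At least 1 p is a q."""
    return ("atleast", 1, p, q, False)


def all_(p: str, q: str) -> Sentence:
    """All p are q, written as At most 0 p are not q (no existential import)."""
    return ("atmost", 0, p, q, True)


example: Structure = {"artist": {1, 2, 3}, "beekeeper": {2, 3, 4}}
```

In `example`, the sentence `some("artist", "beekeeper")` holds while `all_("artist", "beekeeper")` does not; and since `all_` is evaluated by counting counterexamples, All p are q holds vacuously when there are no p, in line with the convention adopted above.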

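We will have occasion below to extend $\mathcal{N}^1$ slightly. Let $\mathcal{N}^{1+}$ consist of $\mathcal{N}^1$
together with the set of $\mathcal{C}^1$-formulas of the forms

$\exists_{\geq C} x(\neg p(x) \wedge q(x))$    $\exists_{\geq C} x(\neg p(x) \wedge \neg q(x))$
$\exists_{\leq C} x(\neg p(x) \wedge q(x))$    $\exists_{\leq C} x(\neg p(x) \wedge \neg q(x))$.    (5)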

These formulas correspond to slightly less natural English sentences with negated 
subjects as follows: 

At least C non-p are q At least C non-p are not q 

At most C non-p are q At most C non-p are not q. 

Turning now to Argument (2), we take the two-variable fragment with counting
quantifiers, here denoted $\mathcal{C}^2$, to be the set of function-free first-order formulas
featuring at most two variables, but with counting quantifiers allowed. We
assume for simplicity that all predicates in $\mathcal{C}^2$ have arity at most 2. And we define
the fragment $\mathcal{N}^2$ to be the set of $\mathcal{C}^2$-formulas consisting of $\mathcal{N}^1$ together with all
formulas of the forms

$\exists_{\geq C} x(p(x) \wedge \exists_{\geq D} y(q(y) \wedge r(x, y)))$    $\exists_{\geq C} x(p(x) \wedge \exists_{\leq D} y(q(y) \wedge r(x, y)))$
$\exists_{\leq C} x(p(x) \wedge \exists_{\geq D} y(q(y) \wedge r(x, y)))$    $\exists_{\leq C} x(p(x) \wedge \exists_{\leq D} y(q(y) \wedge r(x, y)))$,

where p and q are unary predicates, and r is a binary predicate. Linguistically, 
we think of binary predicates as corresponding to transitive verbs, and the above 
formulas to English sentences of the forms 

At least C p r at least D q    At least C p r at most D q
At most C p r at least D q    At most C p r at most D q,    (6)

respectively. (Again, we ignore the issue of singular and plural phrases.) Note
that the sentence-forms in (6) may exhibit scope ambiguities; we have resolved
these by stipulating that subjects always scope over objects. With this
stipulation, we call the fragment of English defined by the sentence-forms (4) and (6)
the numerically definite relational syllogistic. 

We take it as uncontentious that the correspondence between (3) and (4)
provides a rational reconstruction of the notion of validity for arguments in
the numerically definite syllogistic: such an argument is valid just in case the
corresponding $\mathcal{N}^1$-sequent is valid according to the usual semantics of first-order
logic with counting quantifiers. Moreover, for every $\mathcal{N}^1$-formula, there is another
$\mathcal{N}^1$-formula logically equivalent to its negation. Hence, the notion of validity
for $\mathcal{N}^1$-sequents is dual to the notion of satisfiability for sets of $\mathcal{N}^1$-formulas in
the standard way. Similar remarks apply to $\mathcal{N}^2$ and the numerically definite
relational syllogistic. 

Let $\mathcal{L}$ be any logic. The satisfiability problem for $\mathcal{L}$ is the problem of
determining whether a given finite set of $\mathcal{L}$-formulas is satisfiable (has a model);
likewise, the finite satisfiability problem for $\mathcal{L}$ is the problem of determining
whether a given finite set of $\mathcal{L}$-formulas is finitely satisfiable (has a finite model).
A logic $\mathcal{L}$ is said to have the finite model property if every satisfiable finite set of
$\mathcal{L}$-formulas is finitely satisfiable. Thus, $\mathcal{L}$ has the finite model property just in
case the satisfiability and finite satisfiability problems for $\mathcal{L}$ coincide. As usual,
we take the size of any set $\Phi$ of $\mathcal{L}$-formulas to be the number of symbols in $\Phi$,
counting each occurrence of a logical connective or non-logical symbol as 1.
(Technically, one is supposed to take into account how many non-logical symbols
occur in $\Phi$; but for the logics considered here, this would make no difference.)
The computational complexity of the satisfiability problem and the finite
satisfiability problem for $\mathcal{L}$ can then be understood in the normal way. Care is
required, however, when the formulas of $\mathcal{L}$ contain numerical constituents, as is
the case with the logics considered here. Under unary coding, a positive numer- 
ical constituent C is taken to have size C; under binary coding, by contrast, the 
same constituent is taken to have size $\lfloor \log_2 C \rfloor + 1$, in recognition of the fact
that C can be encoded as a bit string without leading zeros. When giving up- 
per complexity bounds, binary coding is the more stringent accounting method; 
when giving lower complexity bounds, unary coding is. In the sequel, binary 
coding will be assumed, unless it is explicitly stated to the contrary. A problem 
is sometimes said to be strongly NP-complete if it is NP-complete (under binary 
coding), and remains NP-hard even under unary coding; and similarly for other 
complexity classes. 
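
The gap between the two accounting methods is easy to make concrete. The sketch below uses the hypothetical function names `unary_size` and `binary_size`, and relies on the fact that Python's `int.bit_length()` returns $\lfloor \log_2 C \rfloor + 1$ for any positive integer $C$.

```python
def unary_size(c: int) -> int:
    """Size of a positive numerical constituent under unary coding."""
    if c <= 0:
        raise ValueError("expected a positive integer")
    return c


def binary_size(c: int) -> int:
    """Size under binary coding: floor(log2(c)) + 1 bits, no leading zeros."""
    if c <= 0:
        raise ValueError("expected a positive integer")
    return c.bit_length()


# The subscript 13 in "At least 13 artists are beekeepers" costs 13 symbols
# under unary coding but only 4 under binary coding (13 is 1101 in binary).
sizes_for_13 = (unary_size(13), binary_size(13))
```

A quantifier subscript such as $2^{100}$ thus contributes 101 symbols under binary coding, but an astronomically large number under unary coding; this is why upper bounds are harder to obtain, and lower bounds easier, under binary coding.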

In a logic with negation, a literal is an atomic formula or the negation of an 
atomic formula; in a logic with negation and disjunction, a clause is a disjunction 
of literals. 

Henceforth, all logarithms have base 2. 

3 Complexity of systems between $\mathcal{N}^1$ and $\mathcal{C}^1$

In this section, we consider logics containing $\mathcal{N}^1$ but contained in $\mathcal{C}^1$.

Lemma 1. The satisfiability problem for $\mathcal{N}^1$ is NP-hard, even under unary
coding.

Proof. If $G$ is an undirected graph (no loops or multiple edges), a 3-colouring of
$G$ is a function $t$ mapping the nodes of $G$ to the set $\{0, 1, 2\}$ such that no edge
of $G$ joins two nodes mapped to the same value. We say that $G$ is 3-colourable
if a 3-colouring of $G$ exists. The problem of deciding whether a given graph $G$
is 3-colourable is well-known to be NP-hard. We reduce it to $\mathcal{N}^1$-satisfiability.
Let the nodes of $G$ be $\{1, \ldots, n\}$. For all $i$ ($1 \leq i \leq n$) and $k$ ($0 \leq k < 3$),
let $p_i^k$ be a fresh unary predicate. Think of $p_i^k(x)$ as saying: "$x$ is a colouring of
$G$ in which node $i$ has colour $k$". Let $\Phi_G$ be the set of $\mathcal{N}^1$-formulas consisting
of

$\exists_{\leq 3} x(p(x) \wedge p(x))$    (7)
$\{\exists_{\leq 0} x(p_i^j(x) \wedge p_i^k(x)) \mid 1 \leq i \leq n,\ 0 \leq j < k < 3\}$    (8)
$\{\exists_{\geq 1} x(p_i^k(x) \wedge p(x)) \mid 1 \leq i \leq n,\ 0 \leq k < 3\}$    (9)
$\{\exists_{\leq 0} x(p_i^k(x) \wedge p_j^k(x)) \mid (i, j) \text{ is an edge of } G,\ 0 \leq k < 3\}$.    (10)

We prove that $\Phi_G$ is satisfiable if and only if $G$ is 3-colourable.

Suppose $\mathfrak{A} \models \Phi_G$. By (7), $|p^{\mathfrak{A}}| \leq 3$. Fix any $i$ ($1 \leq i \leq n$). No $a \in A$
satisfies any two of the predicates $p_i^0$, $p_i^1$, $p_i^2$, by (8); on the other hand, each
of these predicates is satisfied by at least one element of $p^{\mathfrak{A}}$, by (9); therefore,
$|p^{\mathfrak{A}}| = 3$, and each element $a$ of $p^{\mathfrak{A}}$ satisfies exactly one of the predicates
$p_i^0$, $p_i^1$, $p_i^2$. Now fix any $a \in p^{\mathfrak{A}}$, and, for all $i$ ($1 \leq i \leq n$), define $t_a(i)$ to be
the unique $k$ ($0 \leq k < 3$) such that $\mathfrak{A} \models p_i^k[a]$, by the above argument. The
formulas (10) then ensure that $t_a$ defines a colouring of $G$. Conversely, suppose
that $t : \{1, \ldots, n\} \to \{0, 1, 2\}$ defines a colouring of $G$. Let $\mathfrak{A}$ be a structure
with domain $A = \{0, 1, 2\}$; let all three elements satisfy $p$; and, for all $k \in A$, let
$p_i^k$ be satisfied by the single element $k + t(i)$ (where the addition is modulo 3).
It is routine to verify that $\mathfrak{A} \models \Phi_G$. We note that all numerical subscripts in
the formulas of $\Phi_G$ are bounded by 3. Thus, NP-hardness remains however those
numerical subscripts are coded. □

So much for the lower complexity bound for $\mathcal{N}^1$. We now proceed to establish
a matching upper bound for the larger fragment $\mathcal{C}^1$. The crucial step in this
argument is Lemma 3. To set the scene, however, we first recall the following
textbook result (see, e.g. Paris [6], Chapter 10). Denote the set of non-negative
rationals by $\mathbb{Q}^+$.

Lemma 2. Let $\mathcal{E}$ be a system of $m$ linear equations with rational coefficients.
If $\mathcal{E}$ has a solution over $\mathbb{Q}^+$, then $\mathcal{E}$ has a solution over $\mathbb{Q}^+$ with at most $m$
non-zero entries.

Proof. We can write $\mathcal{E}$ as $A\mathbf{x} = \mathbf{c}$, where $A$ is a rational matrix with $m$ rows
and, say, $L$ columns, and $\mathbf{c}$ is a rational column vector of length $m$. If $\mathbf{b}$
is any solution of $\mathcal{E}$ in $\mathbb{Q}^+$ with $k > m$ non-zero entries, the $k$ columns of $A$
corresponding to these non-zero entries must be linearly dependent. Thus, there
exists a non-zero rational vector $\mathbf{b}'$ with zero-entries wherever $\mathbf{b}$ has zero-entries,
such that $A\mathbf{b}' = \mathbf{0}$. But then it is easy to find a rational number $\varepsilon$ such that
$\mathbf{b} + \varepsilon\mathbf{b}'$ is a solution of $\mathcal{E}$ in $\mathbb{Q}^+$ with fewer than $k$ non-zero entries. □

The question naturally arises as to the corresponding bound when solutions
are sought in $\mathbb{N}$, rather than $\mathbb{Q}^+$. Here, the argument of Lemma 2 no longer
works, and the bound of $m$ must be relaxed.
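
The argument of Lemma 2 is constructive, and can be sketched in a few lines of exact rational arithmetic. The code below is an illustration only: the function names `null_vector` and `sparsify` are hypothetical, the dependency $\mathbf{b}'$ is found by plain Gauss-Jordan elimination, and no attempt is made at efficiency.

```python
from fractions import Fraction


def null_vector(cols):
    """Return a non-zero d with sum_j d[j] * cols[j] = 0, or None if none exists."""
    k, m = len(cols), len(cols[0])
    # M is the m x k matrix whose columns are the given vectors.
    M = [[Fraction(cols[j][i]) for j in range(k)] for i in range(m)]
    pivots = []  # pivots[r] is the pivot column of row r
    row = 0
    for col in range(k):
        if row == m:
            break
        pr = next((r for r in range(row, m) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[row], M[pr] = M[pr], M[row]
        piv = M[row][col]
        M[row] = [v / piv for v in M[row]]
        for r in range(m):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    free = next((c for c in range(k) if c not in pivots), None)
    if free is None:
        return None
    # Set the free variable to 1 and solve for the pivot variables.
    d = [Fraction(0)] * k
    d[free] = Fraction(1)
    for r, pc in enumerate(pivots):
        d[pc] = -M[r][free]
    return d


def sparsify(A, b):
    """Given rows A and a non-negative rational solution b, return a non-negative
    solution with the same value of A x and at most len(A) non-zero entries."""
    b = [Fraction(v) for v in b]
    m = len(A)
    while True:
        support = [j for j, v in enumerate(b) if v != 0]
        if len(support) <= m:
            return b
        # More than m columns of height m are always linearly dependent.
        d = null_vector([[A[i][j] for i in range(m)] for j in support])
        if all(x <= 0 for x in d):
            d = [-x for x in d]
        # Largest step that keeps every entry non-negative; it zeroes an entry.
        eps = min(b[j] / x for j, x in zip(support, d) if x > 0)
        for j, x in zip(support, d):
            b[j] -= eps * x
```

For instance, with rows `[[1, 1, 1, 1], [1, 2, 3, 4]]` and the solution `[1, 1, 1, 1]`, the procedure returns a solution with at most two non-zero entries. The entries it produces are in general fractional, which is exactly why this argument does not transfer to solutions over $\mathbb{N}$.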
Definition 1. A Boolean equation is any equation of the form $a_1 x_1 + \cdots + a_n x_n = c$,
where each $a_i$ ($1 \leq i \leq n$) is either 0 or 1, and $c$ is a natural number.



Lemma 3. Let $\mathcal{E}$ be a system of $m$ Boolean equations in $L$ variables. If $\mathcal{E}$ has a
solution over $\mathbb{N}$, then $\mathcal{E}$ has a solution over $\mathbb{N}$ with at most $m \log(L + 1)$ non-zero
entries.

Proof. We write $\mathcal{E}$ as $A\mathbf{x} = \mathbf{c}$, where $A$ is a matrix of 0s and 1s with $m$ rows and
$L$ columns, $\mathbf{c}$ is a column vector over $\mathbb{N}$ of length $m$, and $\mathbf{x} = (x_1, \ldots, x_L)^T$. If $\mathcal{E}$
has a solution over $\mathbb{N}$, let $\mathbf{b} = (b_1, \ldots, b_L)^T$ be such a solution with a minimal
number $k$ of non-zero entries. We show that

$k \leq m \log(L + 1)$. (11)

This condition is trivially satisfied if $k = 0$, so assume $k > 0$. Furthermore, by
renumbering the variables if necessary, we may assume without loss of generality
that $b_j > 0$ for all $j$ ($1 \leq j \leq k$). Now, if $I \subseteq \{1, \ldots, k\}$, define $\mathbf{v}_I$ to be the
$m$-element column vector $(v_1, \ldots, v_m)^T$, where

$v_i = \sum_{j \in I} A_{i,j}$.

That is, $\mathbf{v}_I$ is the sum of those columns of $A$ indexed by elements of $I$. Since
each $v_i$ ($1 \leq i \leq m$) is a natural number satisfying

$v_i \leq L$, (12)

the number of vectors $\mathbf{v}_I$ (as $I$ varies over subsets of $\{1, \ldots, k\}$) is certainly
bounded by $(L + 1)^m$. So suppose, for contradiction, that $k > m \log(L + 1)$.
Then $2^k > (L + 1)^m$, whence there must exist distinct subsets $I$, $I'$ of $\{1, \ldots, k\}$
such that $\mathbf{v}_I = \mathbf{v}_{I'}$. Setting $J = I \setminus I'$ and $J' = I' \setminus I$, it is evident that $J$ and
$J'$ are distinct (and disjoint), again with $\mathbf{v}_J = \mathbf{v}_{J'}$. By interchanging $J$ and $J'$
if necessary, we may assume that $J \neq \emptyset$. Now define, for all $j$ ($1 \leq j \leq L$):

$b'_j = \begin{cases} b_j - 1 & \text{if } j \in J \\ b_j + 1 & \text{if } j \in J' \\ b_j & \text{otherwise,} \end{cases}$

and write $\mathbf{b}' = (b'_1, \ldots, b'_L)^T$. Since $J$ and $J'$ are disjoint, the cases do not
overlap; and since the $b_j$ are all positive ($1 \leq j \leq k$), the $b'_j$ all lie in $\mathbb{N}$.
Moreover,

$A\mathbf{b}' = A\mathbf{b} - \mathbf{v}_J + \mathbf{v}_{J'} = A\mathbf{b}$.

Since $J$ is nonempty, $\min\{b'_j \mid j \in J\}$ is strictly smaller than $\min\{b_j \mid j \in J\}$.
Generating $\mathbf{b}''$, $\mathbf{b}'''$, etc. in this way (using the same $J$ and $J'$) will thus
eventually result in a vector, say $\mathbf{b}^*$, with strictly fewer non-zero entries than $\mathbf{b}$,
but with $A\mathbf{b}^* = A\mathbf{b}$: a contradiction. □
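
The exchange step in the proof of Lemma 3 can be turned into a (deliberately naive) procedure for shrinking the support of a given solution. The sketch below is illustrative only: the names `column_sum`, `find_collision` and `reduce_support` are hypothetical, columns are indexed from 0, and the search for colliding subsets is exponential in the size of the support.

```python
from itertools import combinations


def column_sum(A, subset):
    """The vector v_I: sum of the columns of A indexed by the given subset."""
    return tuple(sum(row[j] for j in subset) for row in A)


def find_collision(A, support):
    """Return disjoint (J, J2) with J non-empty and equal column sums, or None."""
    seen = {}
    for size in range(len(support) + 1):
        for subset in combinations(support, size):
            key = column_sum(A, subset)
            if key in seen:
                first, second = set(seen[key]), set(subset)
                J, J2 = first - second, second - first
                return (J, J2) if J else (J2, J)
            seen[key] = subset
    return None


def reduce_support(A, b):
    """Given a 0/1 matrix A (list of rows) and a solution b over the naturals,
    return a solution over the naturals with the same value of A b on which
    no two distinct subsets of the support have equal column sums."""
    b = list(b)
    while True:
        support = [j for j, v in enumerate(b) if v > 0]
        collision = find_collision(A, support)
        if collision is None:
            return b
        J, J2 = collision
        # Repeat the exchange b -> b' until some entry indexed by J hits zero.
        steps = min(b[j] for j in J)
        for j in J:
            b[j] -= steps
        for j in J2:
            b[j] += steps
```

When the procedure stops, the $2^k$ subsets of the $k$-element support have pairwise distinct column sums, so $2^k \leq (L + 1)^m$ and hence $k \leq m \log(L + 1)$, as in (11). For the single equation $x_1 + x_2 + x_3 = 3$ and the solution $(1, 1, 1)$, the procedure ends with $(0, 0, 3)$.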
By way of a digression, we strengthen Lemma 3 to obtain a bound which
does not depend on $L$.

Proposition 1. Let $\mathcal{E}$ be a system of $m$ Boolean equations. If $\mathcal{E}$ has a solution
over $\mathbb{N}$, then $\mathcal{E}$ has a solution over $\mathbb{N}$ with at most $\frac{5}{2} m \log m + 1$ non-zero entries.

Proof. The case $m = 1$ is trivial: if $\mathcal{E}$ has a solution, then it has a solution with
at most one non-zero entry. So assume henceforth that $m > 1$.

In the proof of Lemma 3, the inequality (12) can evidently be strengthened
to

$v_i \leq k$.

Proceeding exactly as for Lemma 3, we obtain, in place of the inequality (11),

$k \leq m \log(k + 1)$.

Hence, for $k$ positive, we have

$\frac{k}{\log(k + 1)} \leq m$. (13)

Now the left-hand side of (13) is greater than or equal to unity, and since the
function $x \mapsto x \log x$ is monotone increasing for $x \geq e^{-1}$, we can apply it to
both sides of (13) to obtain

$k Z(k) \leq m \log m$, (14)

where, for all $k > 0$,

$Z(k) = \frac{\log k - \log\log(k + 1)}{\log(k + 1)}$.

It is straightforward to check that $Z$ is monotone increasing on the positive
integers, and that $Z(k) \to 1$ as $k \to \infty$. (Indeed, for $x > 0$, the function
$x \mapsto \log x / \log(x + 1)$ is monotone increasing with limit 1 as $x$ tends to $\infty$; and
for $x \geq 2^e - 1$, the function $x \mapsto \log\log(x + 1) / \log(x + 1)$ is monotone decreasing
with limit 0.)

We may now establish that $k \leq \frac{5}{2} m \log m + 1$. Calculation shows that
$1/Z(7) \approx 2.4542 < \frac{5}{2}$. Therefore, since $Z$ is monotone increasing, (14) yields,
for $k \geq 7$, the inequalities $k \leq m \log m / Z(k) \leq m \log m / Z(7) < \frac{5}{2} m \log m$.
Obviously, if $k \leq 6$, we have $k \leq \frac{5}{2} m \log m + 1$, since $m \geq 2$ by assumption. □
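
The numerical facts used in this proof are easily checked by machine. The following sketch assumes only the definition of $Z$ given above (with logarithms to base 2); the name `Z` is taken from the text, while `small_case_bound` is a hypothetical helper.

```python
import math


def Z(k: int) -> float:
    """Z(k) = (log k - log log(k + 1)) / log(k + 1), logarithms to base 2."""
    return (math.log2(k) - math.log2(math.log2(k + 1))) / math.log2(k + 1)


def small_case_bound(m: int) -> float:
    """The bound (5/2) m log m + 1 of Proposition 1."""
    return 2.5 * m * math.log2(m) + 1


# 1/Z(7) is roughly 2.4542, which is below 5/2.
inverse_z7 = 1 / Z(7)
# Z increases on an initial segment of the positive integers (spot check only).
increasing_up_to_200 = all(Z(k) < Z(k + 1) for k in range(1, 200))
```

For $m = 2$ the bound evaluates to exactly 6, which is why the case $k \leq 6$ needs no further argument.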

The proof of Proposition 1 actually shows a little more than advertised:
for any real c > 1, there exists a d such that, if £ is a system of m Boolean 
equations with a solution over N, then £ has a solution over N with at most 
cm log m + d non-zero entries. (As c approaches unity, the required value of d 
given by the above proof quickly becomes astronomical.) It follows that none of 
these bounds is optimal, in the sense of being achieved infinitely often. On the 
other hand, the next lemma shows that, for systems of Boolean equations with 
variables ranging over $\mathbb{N}$, the bound of $m$ reported in Lemma 2 is definitely not
available, a fact which will prove useful in Section 5.
Lemma 4. Fix $m \geq 6$. Let $A$ be the $m \times (m + 1)$-matrix given by

[Matrix display lost in extraction. Row $i$ ($1 \leq i \leq m - 1$) has 1s in columns $i$, $i + 1$, $i + 2$ and 0s elsewhere. The seven leading entries of the last row are not legible in this copy; the proof below uses only that, for a vector of period three, they yield the equation $3b_1 + b_2 = 4$.]

in which a pattern of three 1s is shifted right across the first $(m - 1)$ rows, and
the last row contains the seven entries shown on the left followed by $(m - 6)$ 0s.
Let $\mathbf{c}$ be the column vector of length $m$ given by

$\mathbf{c} = (3, 3, \ldots, 3, 4)^T$

consisting of $(m - 1)$ 3s and a single 4. Then the unique solution of the system
of Boolean equations $A\mathbf{x} = \mathbf{c}$ over $\mathbb{N}$ is the column vector $(1, \ldots, 1)^T$ consisting
of $(m + 1)$ 1s.

Proof. Evidently, $A(1, \ldots, 1)^T = \mathbf{c}$. Conversely, suppose $\mathbf{b} = (b_1, \ldots, b_{m+1})^T$ is
any solution of $A\mathbf{x} = \mathbf{c}$ in $\mathbb{N}$. From the first row of $A$, $b_1 + b_2 + b_3 = 3$, whence
$b_1, b_2, b_3$ are either (i) the integers 0, 0, 3 in some order, or (ii) the integers 0, 1, 2
in some order, or (iii) the integers 1, 1, 1. By considering rows 2 to $m - 1$ of $A$,
it is then easy to see that, in every case, these three values must recur, in the
same order, to the end of the vector: that is, $\mathbf{b}$ must have the form

$(b_1, b_2, b_3, b_1, b_2, b_3, b_1, \ldots)^T$.

From the last row of $A$, then, $3b_1 + b_2 = 4$. Thus, $b_1, b_2, b_3$ are certainly not
3, 0, 0, in any order. Suppose, then, $b_1, b_2, b_3$ are 0, 1, 2, in some order. If $b_1 = 0$,
then $3b_1 + b_2$ is at most 2; if $b_1 = 1$, then $3b_1 + b_2$ equals either 3 or 5; and if
$b_1 = 2$, then $3b_1 + b_2$ is at least 6. Thus, $b_1, b_2, b_3$ are not 0, 1, 2, in any order,
whence $\mathbf{b} = (1, \ldots, 1)^T$ as required. □
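
For small $m$, the uniqueness claim of Lemma 4 can be confirmed by exhaustive search. The sketch below rests on a stated assumption: the seven leading entries of the last row are taken to be $(1, 1, 0, 1, 0, 0, 1)$, which is one choice consistent with the equation $3b_1 + b_2 = 4$ used in the proof, and not necessarily the row intended in the statement of the lemma. The function names are likewise hypothetical.

```python
from itertools import product

# ASSUMPTION: one last row consistent with the proof (3*b1 + b2 = 4 for a
# period-three vector); the original display may differ.
ASSUMED_LAST_ROW_PREFIX = (1, 1, 0, 1, 0, 0, 1)


def lemma4_matrix(m, last_row_prefix=ASSUMED_LAST_ROW_PREFIX):
    """Build the m x (m + 1) matrix: three 1s shifted right, then the last row."""
    rows = [[1 if i <= j <= i + 2 else 0 for j in range(m + 1)]
            for i in range(m - 1)]
    rows.append(list(last_row_prefix) + [0] * (m - 6))
    return rows


def solutions(m, max_entry=4):
    """All vectors over {0, ..., max_entry} solving A x = (3, ..., 3, 4)^T.

    Every variable occurs in some equation with right-hand side at most 4,
    so no solution over the naturals has an entry larger than 4.
    """
    A = lemma4_matrix(m)
    c = [3] * (m - 1) + [4]
    return [x for x in product(range(max_entry + 1), repeat=m + 1)
            if all(sum(a * v for a, v in zip(row, x)) == ci
                   for row, ci in zip(A, c))]
```

Under this assumption, `solutions(6)` contains only the all-ones vector of length 7, so a system of $m = 6$ Boolean equations can force $m + 1 = 7$ non-zero entries.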

Returning to the main business of this section, we have: 

Theorem 1. The fragment $\mathcal{C}^1$ has the finite model property. Moreover, the
satisfiability (= finite satisfiability) problem for $\mathcal{C}^1$ is in NP.

Proof. If $\phi$, $\psi$ and $\pi$ are $\mathcal{C}^1$-formulas, denote by $\phi[\pi/\psi]$ the result of substituting
$\pi$ for all occurrences of $\psi$ (as subformulas) in $\phi$.

It is straightforward to transform any $\mathcal{C}^1$-formula $\phi$, in polynomial time,
into a closed $\mathcal{C}^1$-formula $\phi'$ containing no occurrences of equality, no
proposition-letters and no individual constants, such that $\phi$ is satisfiable over a given domain
$A$ if and only if $\phi'$ is satisfiable over $A$. Indeed, we may further restrict attention
to such $\phi'$ having the form

$\bigwedge_{1 \leq i \leq m} \exists_{\bowtie_i C_i} x \phi_i$, (15)

where the symbols $\bowtie_i$ are any of $\{\leq, \geq, =\}$, and the $\phi_i$ are quantifier free. For
suppose $\phi'$ does not have this form: we process $\phi'$ as follows. Choose any
quantified subformula $\psi = \exists_{\bowtie C} x \pi$ with $\pi$ quantifier-free, and non-deterministically
replace $\phi'$ by either $\phi'[\top/\psi] \wedge \psi$ or $\phi'[\bot/\psi] \wedge \neg\psi$; then repeat this procedure
until all embedded quantification has been removed. The result will be, modulo
trivial logical equivalences, a formula of the form (15), and $\phi'$ will be satisfiable
over a given domain $A$ if and only if some formula of the form (15) obtained
in this way is satisfiable over $A$. Thus, any polynomial-time non-deterministic
algorithm to check the (finite) satisfiability of formulas of the form (15)
easily yields a polynomial-time non-deterministic algorithm to check the (finite)
satisfiability of $\mathcal{C}^1$-formulas.

Fix $\phi$ to be of the form (15), then, with no individual constants,
proposition letters or equality. Suppose that the unary predicates occurring in $\phi$ are
$p_1, \ldots, p_l$. Call any formula of the form $\pi = \pm p_1(x) \wedge \cdots \wedge \pm p_l(x)$ a 1-type.
Let the 1-types be enumerated in some way as $\pi_1, \ldots, \pi_L$, where $L = 2^l$. Any
structure $\mathfrak{A}$ interpreting the $p_1, \ldots, p_l$ can evidently be characterized, up to
isomorphism, by the sequence of cardinal numbers $(\alpha_1, \ldots, \alpha_L)$, where $\alpha_j$ is the
cardinality of the set $\{a \in A : \mathfrak{A} \models \pi_j[a]\}$ for all $j$ ($1 \leq j \leq L$). Denote this
sequence by $\alpha(\mathfrak{A})$. For all $i$ ($1 \leq i \leq m$) and $j$ ($1 \leq j \leq L$), define

$a_{i,j} = \begin{cases} 1 & \text{if } \models \pi_j \to \phi_i \\ 0 & \text{otherwise.} \end{cases}$

Interpreting the arithmetic operations involving infinite cardinals in the
expected way, if $\mathfrak{A} \models \phi$, then $\alpha(\mathfrak{A})$ is a simultaneous solution of

$\begin{array}{ccc} a_{1,1} x_1 + \cdots + a_{1,L} x_L & \bowtie_1 & C_1 \\ \vdots & \vdots & \vdots \\ a_{m,1} x_1 + \cdots + a_{m,L} x_L & \bowtie_m & C_m \end{array}$    (16)

with at least one non-zero value. Conversely, given any solution $\alpha_1, \ldots, \alpha_L$ of (16)
with at least one non-zero value, we can construct a model $\mathfrak{A}$ of $\phi$ such that
$\alpha(\mathfrak{A}) = (\alpha_1, \ldots, \alpha_L)$. Setting $C = \max\{C_i \mid 1 \leq i \leq m\}$, we see that, if
$\alpha_1, \ldots, \alpha_L$ is a solution of (16), then so is $\beta_1, \ldots, \beta_L$, where $\beta_j = \min(\alpha_j, C)$ for
all $j$ ($1 \leq j \leq L$). It follows easily that $\mathcal{C}^1$ has the finite model property.

By Lemma 3, (16) has a solution over $\mathbb{N}$ if and only if it has a solution
in which at most $m \log(L + 1) \leq m(l + 1) \leq |\phi|^2$ values are nonzero. (The
requirement that the solution in question contain at least one non-zero value
can easily be accommodated by adding one more inequality, if necessary.) By
the reasoning of the previous paragraph, we may again assume all these
non-zero values to be bounded by $C$. But any such solution can be written down
and checked in a time bounded by a polynomial function of the size of $\phi$. □
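
The final step of this proof amounts to checking a sparse certificate: a short list of 1-types together with their multiplicities. A minimal sketch of such a check follows, under assumptions of our own: a 1-type is encoded as a tuple of truth values, each quantifier-free $\phi_i$ is supplied as a Python predicate on 1-types, multiplicities are assumed to be natural numbers, and the names `check_certificate` and `OPS` are hypothetical.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


def check_certificate(conjuncts, counts):
    """Check a sparse solution of a system of the form (16).

    conjuncts: list of (matrix, op, bound), one per conjunct of (15), where
               matrix maps a 1-type to True or False.
    counts:    dict mapping a 1-type to the number of elements realizing it;
               1-types not listed are realized by no element.
    """
    if not any(n > 0 for n in counts.values()):
        return False  # the domain of a structure must be non-empty
    for matrix, op, bound in conjuncts:
        total = sum(n for one_type, n in counts.items() if matrix(one_type))
        if not OPS[op](total, bound):
            return False
    return True


# Argument (1), with 1-types ordered (artist, beekeeper, carpenter, dentist).
premises = [
    (lambda t: t[0] and t[1], ">=", 13),     # At least 13 artists are beekeepers
    (lambda t: t[1] and t[2], "<=", 3),      # At most 3 beekeepers are carpenters
    (lambda t: t[3] and not t[2], "<=", 4),  # At most 4 dentists are not carpenters
]
negated_conclusion = (lambda t: t[0] and not t[3], "<=", 5)
candidate = {(True, True, False, False): 13}
```

The candidate (thirteen artist-beekeepers who are neither carpenters nor dentists) satisfies the premises but not the negated conclusion, as expected for a valid argument. The check runs in time linear in the number of listed 1-types for each conjunct, which is the sense in which a certificate with polynomially many non-zero entries can be verified in polynomial time.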
Corollary 1. The satisfiability problem (= finite satisfiability problem) for any
logic between $\mathcal{N}^1$ and $\mathcal{C}^1$ is strongly NP-complete.

It follows that determining the validity of arguments in the numerically 
definite syllogistic is a co-NP-complete problem. Equipping this fragment with
relative clauses, for example, 

At most 3 artists who are not beekeepers are carpenters, 

evidently has no effect on the complexity of determining validity, since it does
not take us outside the fragment $\mathcal{C}^1$. Nor indeed has the addition of proper
nouns, for the same reason. In fact, it is straightforward to show that the
complexity bound for satisfiability (and finite satisfiability) given in Theorem 1
applies to extensions of $\mathcal{C}^1$ featuring a large variety of other quantifiers. The only
requirement is that the truth-value of a formula $Qx(\psi_1, \ldots, \psi_n)$ be expressible
as a collection of 'linear' constraints involving the cardinalities (possibly infinite)
of Boolean combinations of the sets of elements satisfying the $\psi_1, \ldots, \psi_n$. In
particular, we obtain the same complexity for extensions of the numerically 
definite syllogistic featuring such sentences as 

There are more artists than beekeepers 
Most artists are beekeepers 

There are more than 3.7 times as many artists as beekeepers 
There are finitely many carpenters. 

The details are routine, and we leave them to the reader to explore. Extensions 
of the syllogistic with 'proportional' quantifiers are considered by Peterson [8]; 
however, no complexity-theoretic analysis is undertaken in those papers, and 
certainly no analogues of Lemma 3 or Proposition 1 are provided.

We conclude this section with some remarks on a related problem. Denote by
$S$ the propositional language (with usual Boolean connectives) over the countable
signature of proposition letters $p_1, p_2, \ldots$. A probability assignment for $S$ is a
function $P : S \to [0, 1]$ satisfying the usual (Kolmogorov) axioms. The problem
PSAT may now be defined as follows. Let a list of pairs $(\phi_1, q_1), \ldots, (\phi_m, q_m)$
be given, where each $\phi_i$ is a clause of $S$, and each $q_i$ is a rational number: decide
whether there exists a probability assignment $P$ for $S$ such that

$P(\phi_i) = q_i$ for all $i$ ($1 \leq i \leq m$). (17)

The size of any problem instance $(\phi_1, q_1), \ldots, (\phi_m, q_m)$ is measured in the
obvious way, with binary coding of the $q_i$. By comparing (17) with (15), we see that
the satisfiability problem for $\mathcal{C}^1$ is, as it were, an 'integral' version of PSAT. The
problem $k$-PSAT is the restriction of PSAT to the case where all the clauses $\phi_i$
have at most $k$ literals.

Georgakopoulos et al. [2] show that 2-PSAT is NP-hard, even under unary
coding, and that PSAT is in NP. (Hence, $k$-PSAT is strongly NP-complete for all
$k \geq 2$.) The proof that 2-PSAT is NP-hard is essentially the same as the proof
of Lemma 1. Moreover, the proof that PSAT is in NP is similar in structure to
the proof of Theorem 1. Suppose we are given an instance of PSAT in which the
$\phi_i$ mention only the proposition letters $p_1, \ldots, p_l$: the challenge is to show that,
if there exists a probability assignment $P$ satisfying (17), then there exists one
in which the number of formulas $\pm p_1 \wedge \cdots \wedge \pm p_l$ having non-zero probability
is polynomially bounded as a function of $l$. But this is easily guaranteed by
Lemma 2. By contrast, Lemma 2 does not suffice for the proof of our Theorem 1,
because it does not guarantee the existence of an integral solution of the relevant
equations; hence the need for Lemma 3 (or Proposition 1). We return to this
matter in Section 5.

4 Complexity of systems between $\mathcal{N}^2$ and $\mathcal{C}^2$

We now turn our attention to logics containing $\mathcal{N}^2$ but contained in $\mathcal{C}^2$.

Lemma 5. The fragment $\mathcal{N}^2$ has the finite model property.

Proof. Suppose $\mathfrak{A} \models \Phi$, where $\Phi$ is a set of $\mathcal{N}^2$-formulas. If $\phi \in \Phi$ is of the form
$\exists_{\geq D} x \psi(x)$, let $A_\phi$ be a collection of $D$ individuals satisfying $\psi$ in $\mathfrak{A}$, and let

$A_\Phi = \bigcup \{A_\phi \mid \phi \in \Phi \text{ is of the form } \exists_{\geq D} x \psi(x)\}$.

As in Theorem 1, let the unary predicates occurring in $\Phi$ be $p_1, \ldots, p_l$, and let
$\pi_1(x), \ldots, \pi_L(x)$ be all the formulas of the form $\pm p_1(x) \wedge \cdots \wedge \pm p_l(x)$, enumerated
in some way. Then $A$ is the union of the pairwise disjoint sets $A_1, \ldots, A_L$, where
$A_j = \{a \in A \mid \mathfrak{A} \models \pi_j[a]\}$. Let $C$ be the largest quantifier subscript occurring
in $\Phi$. Evidently, for all $j$ ($1 \leq j \leq L$),

$|A_\Phi \cap A_j| \leq |A_\Phi| \leq |\Phi| C$,

whence we may certainly select a set of elements $A'_j$ such that $A_\Phi \cap A_j \subseteq A'_j \subseteq A_j$
and

$|A'_j| = \min(|A_j|, |\Phi| C + 1)$.

Thus $A' = A'_1 \cup \cdots \cup A'_L$ is finite. We define a structure $\mathfrak{A}'$ over $A'$ as follows.
Interpret the unary predicates so that, for all $j$ ($1 \leq j \leq L$) and all $a' \in A'_j$,
$\mathfrak{A}' \models \pi_j[a']$. Interpret each binary predicate $r$ in such a way that, for all $a' \in A'$
and all $j$ ($1 \leq j \leq L$),

$|\{b' \in A'_j : \mathfrak{A}' \models r[a', b']\}| = \min(|\{b \in A_j : \mathfrak{A} \models r[a', b]\}|, |\Phi| C + 1)$.

This is evidently possible. Consider any formula $\theta(x)$ of either of the forms

$\exists_{\leq D} y(q(y) \wedge r(x, y))$    $\exists_{\geq D} y(q(y) \wedge r(x, y))$    (18)

with $D \leq C + 1 \leq |\Phi| C + 1$. It is immediate from the construction of $\mathfrak{A}'$ that,
for all $a' \in A'$,

$\mathfrak{A} \models \theta[a'] \Rightarrow \mathfrak{A}' \models \theta[a']$. (19)

Hence, if the numerical subscript $D$ in $\theta$ satisfies $D \leq C$, we also have:

$\mathfrak{A} \not\models \theta[a'] \Rightarrow \mathfrak{A}' \not\models \theta[a']$. (20)

We show that $\mathfrak{A}' \models \phi$ for all $\phi \in \Phi$. If $\phi$ is an $\mathcal{N}^1$-formula, this result is
immediate from the construction of $\mathfrak{A}'$. If $\phi$ has the form $\exists_{\geq D'} x(p(x) \wedge \theta(x))$,
with $\theta$ having one of the forms (18), the result follows from (19) and the fact
that $A_\phi \subseteq A'$. If $\phi$ has the form $\exists_{\leq D'} x(p(x) \wedge \theta(x))$, with $\theta(x)$ having one of
the forms (18), the result follows from (20). □

Inspection of the proof of Lemma 5 shows that the size of the constructed
model is bounded by an exponential function of the size of $\Phi$. Hence, the
satisfiability (= finite satisfiability) problem for $\mathcal{N}^2$ is in NEXPTIME.

It is well-known that the larger fragment $\mathcal{C}^2$ lacks the finite model property.
For example, the formula

$\exists x \forall y \neg r(x, y) \wedge \forall x \exists y\, r(y, x) \wedge \forall x \exists_{\leq 1} y\, r(x, y)$ (21)

is satisfiable, but not finitely so. Thus, the satisfiability problem and the finite
satisfiability problem for $\mathcal{C}^2$ do not coincide. Nevertheless, the following was
shown in Pratt-Hartmann [10].

Theorem 2. The satisfiability problem and the finite satisfiability problem for
$\mathcal{C}^2$ are both in NEXPTIME.

This upper bound applies even when counting quantifiers are coded in binary,
a fact which is significant here. In this section, we provide a matching lower
bound for $\mathcal{N}^2$, which is slightly surprising given the latter fragment's expressive
limitations.

A tiling system is a triple $(C, H, V)$, where $C$ is a finite set and $H$, $V$ are
binary relations on $C$. The elements of $C$ are referred to as colours, and the
relations $H$ and $V$ as the horizontal and vertical constraints, respectively. For
any integer $N$, a tiling for $(C, H, V)$ of size $N$ is a function $t : \{0, \ldots, N - 1\}^2 \to C$
such that, for all $i$, $j$ in the range $\{0, \ldots, N - 1\}$, the pair $(t(i, j), t(i + 1, j))$ is
in $H$ and the pair $(t(i, j), t(i, j + 1))$ is in $V$, with addition interpreted modulo
$N$. A tiling of size $N$ is to be pictured as a colouring of an $N \times N$ square grid
(with toroidal wrap-around) by the colours in $C$; the horizontal constraints $H$
thus specify which colours may appear 'to the right of' which other colours; the
vertical constraints $V$ likewise specify which colours may appear 'above' which
other colours. By a $C$-sequence, we simply mean a sequence $\bar{\imath} = i_0, \ldots, i_{n-1}$ of
elements of $C$ (repeats allowed). The $C$-sequence $\bar{\imath}$ is an initial configuration of
a tiling $t$ if $\bar{\imath} = t(0, 0), \ldots, t(n - 1, 0)$.
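
These definitions translate directly into a checker. The sketch below is illustrative only: a tiling is represented as a dictionary from grid positions to colours, and the names `is_tiling` and `has_initial_configuration` are hypothetical.

```python
def is_tiling(t, N, H, V):
    """Check that t (a dict from (i, j) to a colour) is a tiling of size N
    for the horizontal constraints H and vertical constraints V, both given
    as sets of colour pairs; the grid wraps around toroidally."""
    for i in range(N):
        for j in range(N):
            if (t[i, j], t[(i + 1) % N, j]) not in H:
                return False
            if (t[i, j], t[i, (j + 1) % N]) not in V:
                return False
    return True


def has_initial_configuration(t, sequence):
    """Check that the given C-sequence is an initial configuration of t."""
    return all(t[k, 0] == colour for k, colour in enumerate(sequence))


# A two-colour checkerboard: adjacent cells must differ in colour.
H = V = {(0, 1), (1, 0)}
checkerboard = {(i, j): (i + j) % 2 for i in range(2) for j in range(2)}
```

The checkerboard is a tiling of size 2 with initial configuration 0, 1. The same colouring rule fails on a grid of size 3, because the wrap-around places two cells of the same colour next to each other.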

Theorem 3. The satisfiability problem for $\mathcal{N}^2$ is NEXPTIME-hard.

Proof. Let $(C, H, V)$ be a tiling system and $p$ a polynomial. For any $C$-sequence
$\bar{\imath}$ of length $n$, we construct, in time bounded by a polynomial function of $n$, a
set $\Theta_{\bar{\imath}}$ of $\mathcal{N}^2$-formulas such that $\Theta_{\bar{\imath}}$ is satisfiable if and only if $(C, H, V)$ has a
tiling of size $2^{p(n)}$ with initial configuration $\bar{\imath}$. Thus, we may regard $\Theta_{\bar{\imath}}$ as an
encoding of the $C$-sequence $\bar{\imath}$ with respect to the tiling system $(C, H, V)$ and
the function $2^{p(n)}$. The existence of such an encoding suffices to show that the
satisfiability problem for $\mathcal{N}^2$ is NEXPTIME-hard.

To motivate the technical details, we suppose, provisionally, that $(C, H, V)$
does have a tiling of size $2^{p(n)}$ with initial configuration $\bar{\imath}$, and we construct the
encoding $\Theta_{\bar{\imath}}$ in parallel with a structure $\mathfrak{A}$ in which $\Theta_{\bar{\imath}}$ is true. As we do so, we
show that, conversely, if $\Theta_{\bar{\imath}}$ is satisfiable, then $(C, H, V)$ has a tiling with the
required properties. The construction of $\Theta_{\bar{\imath}}$ proceeds in two stages. In the first
stage, we employ familiar techniques to obtain an encoding in an extension of
$\mathcal{N}^2$. In the second stage, we employ some less familiar methods to obtain an
encoding in $\mathcal{N}^2$.

First stage: For convenience, we set $N = 2^{p(n)}$ and $s = 2(p(n)^2 + p(n) + 1)$.
Let $A_1$ be the set of pairs of integers in the range $\{0, \ldots, N - 1\}$, and let
$A_2$ be the set of pairs of the forms $(i, \top)$ and $(i, \bot)$, where $i$ is an integer in the
range $\{1, \ldots, s\}$ and $\top$, $\bot$ are any distinct symbols. Evidently,

$|A_1| = N^2$
$|A_2| = 2s$.

Finally, let $A_3$ be a set disjoint from $A_1$ and $A_2$ satisfying

$|A_3| = (M - 1)N^2$,

where $M = |C|$. We refer to $A_1$ as the grid, $A_2$ as the notebook, and $A_3$ as the
rubbish dump. Our structure $\mathfrak{A}$ will have domain $A = A_1 \cup A_2 \cup A_3$.

Any natural number $l$ in the range $\{0, \ldots, N-1\}$ can be written uniquely
as $l = \sum_{i=0}^{p(n)-1} \delta_i 2^i$, where $\delta_i \in \{0, 1\}$. We say that $\delta_i$ is the $i$th digit of $l$.
Thus, digits are enumerated in order of increasing significance, starting with
the zeroth. Let $q, X_0, \ldots, X_{p(n)-1}, \bar{X}_0, \ldots, \bar{X}_{p(n)-1}$ be new unary predicates,
interpreted in the structure $\mathfrak{A}$ as follows:

\[
\begin{array}{rcl}
q^{\mathfrak{A}} &=& A_1 \\
X_i^{\mathfrak{A}} &=& \{(l,m) \in A_1 \mid \text{the $i$th digit of $l$ is 1}\} \\
\bar{X}_i^{\mathfrak{A}} &=& \{(l,m) \in A_1 \mid \text{the $i$th digit of $l$ is 0}\}.
\end{array}
\]

We may read $X_i$ as "has an x-coordinate whose $i$th digit is 1", and $\bar{X}_i$ as "has
an x-coordinate whose $i$th digit is 0". Then $\mathfrak{A} \models \Theta_{0,x}$, where $\Theta_{0,x}$ is the set
of formulas

\[
\begin{array}{lll}
\exists_{\le N^2} x\, q(x) & & \\
\exists_{\ge N^2/2} x\, X_i(x) & \exists_{\ge N^2/2} x\, \bar{X}_i(x) & (0 \le i < p(n)) \\
\forall x(X_i(x) \to q(x)) & \forall x(\bar{X}_i(x) \to q(x)) & (0 \le i < p(n)) \\
\forall x(X_i(x) \to \neg \bar{X}_i(x)) & & (0 \le i < p(n)).
\end{array}
\]

Conversely, in any model of $\Theta_{0,x}$, exactly $N^2$ elements satisfy $q$, and the
extensions of $X_i$ and $\bar{X}_i$ are complementary with respect to that collection of
elements, for all $i$ ($0 \le i < p(n)$).
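
The counting behind these formulas can be checked on a small grid. The sketch below is illustrative only: it assumes a concrete value of p(n), builds the grid and the extensions of the digit predicates and their barred counterparts, and leaves it to the caller to confirm that each extension has exactly N²/2 elements and that the two are complementary on the grid. The function names are hypothetical.

```python
from itertools import product


def digit(value, i):
    """Return the i-th binary digit of value (digit 0 is the least significant)."""
    return (value >> i) & 1


def x_extensions(p):
    """Return the grid A_1 and the extensions of X_i and its barred counterpart.

    p plays the role of p(n); the grid has side N = 2**p.
    """
    side = 2 ** p
    grid = set(product(range(side), repeat=2))
    x_pos = [{(l, m) for (l, m) in grid if digit(l, i) == 1} for i in range(p)]
    x_neg = [{(l, m) for (l, m) in grid if digit(l, i) == 0} for i in range(p)]
    return grid, x_pos, x_neg
```

For p = 3 the grid has 64 squares and every digit predicate holds of exactly 32 of them, which is why the two lower bounds of N²/2, together with disjointness and the upper bound of N² on q, pin the extension of q down to exactly N² elements.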

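Further, let $X^*_0, \ldots, X^*_{p(n)}$ be new unary predicates, interpreted in the
structure $\mathfrak{A}$ as follows:

\[
\begin{array}{rcll}
(X^*_0)^{\mathfrak{A}} &=& A_1 \setminus X_0^{\mathfrak{A}} & \\
(X^*_i)^{\mathfrak{A}} &=& (X_0^{\mathfrak{A}} \cap \cdots \cap X_{i-1}^{\mathfrak{A}}) \setminus X_i^{\mathfrak{A}} & (1 \le i < p(n)) \\
(X^*_{p(n)})^{\mathfrak{A}} &=& X_0^{\mathfrak{A}} \cap \cdots \cap X_{p(n)-1}^{\mathfrak{A}}. &
\end{array}
\]

Thus, the predicate $X^*_i$ can be read as "has an x-coordinate in which all digits
before the $i$th, but not the $i$th digit itself, are 1". Finally, for all $i, j$ ($0 \le i < j <
p(n)$), let $X^+_{i,j}$ and $X^-_{i,j}$ be new unary predicates, interpreted in the structure $\mathfrak{A}$
as follows:

\[
\begin{array}{rcl}
(X^+_{i,j})^{\mathfrak{A}} &=& (X^*_i)^{\mathfrak{A}} \cap X_j^{\mathfrak{A}} \\
(X^-_{i,j})^{\mathfrak{A}} &=& (X^*_i)^{\mathfrak{A}} \setminus X_j^{\mathfrak{A}}.
\end{array}
\]

Let $\Gamma_x$ be the set of first-order formulas $\forall x(q(x) \to \gamma)$, where $\gamma$ is any of the
following clauses:

\[
\begin{array}{ll}
X^*_i(x) \vee [X_i(x) \vee \bigvee_{0 \le k < i} \bar{X}_k(x)] & (0 \le i < p(n)) \\
X^*_{p(n)}(x) \vee [\bigvee_{0 \le k < p(n)} \bar{X}_k(x)] & \\
X^+_{i,j}(x) \vee [\bar{X}_j(x) \vee X_i(x) \vee \bigvee_{0 \le k < i} \bar{X}_k(x)] & (0 \le i < j < p(n)) \\
X^-_{i,j}(x) \vee [X_j(x) \vee X_i(x) \vee \bigvee_{0 \le k < i} \bar{X}_k(x)] & (0 \le i < j < p(n)).
\end{array}
\]

(The square brackets are for legibility.) It is immediate that $\mathfrak{A} \models \Gamma_x$. The
formulas $\Gamma_x$, in effect, establish sufficient conditions for satisfaction of the
predicates $X^*_i$ etc. in terms of the predicates $X_i$ and $\bar{X}_i$. Warning: these formulas
are not in the fragment $\mathcal{N}^2$.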

Similarly, let $Y_0, \ldots, Y_{p(n)-1}, \bar{Y}_0, \ldots, \bar{Y}_{p(n)-1}$ be new unary predicates,
interpreted in the structure $\mathfrak{A}$ as follows:

\[
\begin{array}{rcl}
Y_i^{\mathfrak{A}} &=& \{(l,m) \in A_1 \mid \text{the $i$th digit of $m$ is 1}\} \\
\bar{Y}_i^{\mathfrak{A}} &=& \{(l,m) \in A_1 \mid \text{the $i$th digit of $m$ is 0}\}.
\end{array}
\]

We may read $Y_i$ as "has a y-coordinate whose $i$th digit is 1", and $\bar{Y}_i$ as "has a
y-coordinate whose $i$th digit is 0". Let $\Theta_{0,y}$ be the set of formulas constructed
analogously to $\Theta_{0,x}$, but with "X" replaced systematically by "Y"; and let
$\Theta_0 = \Theta_{0,x} \cup \Theta_{0,y}$. Further, let $Y^*_i$ ($0 \le i \le p(n)$), $Y^+_{i,j}$ ($0 \le i < j < p(n)$)
and $Y^-_{i,j}$ ($0 \le i < j < p(n)$) be new unary predicates, interpreted analogously
to their X-counterparts; let the formulas $\Gamma_y$ be constructed analogously to $\Gamma_x$;
and let $\Gamma = \Gamma_x \cup \Gamma_y$.

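We may now impose a toroidal grid structure on $A_1$, with the aid of a pair
of binary predicates $h$ and $v$. Let $\mathfrak{A}$ interpret $h$ and $v$ as follows:

\[
\begin{array}{rcl}
h^{\mathfrak{A}} &=& \{((l,m),(l+1,m)) \mid 0 \le l < N,\ 0 \le m < N\} \\
v^{\mathfrak{A}} &=& \{((l,m),(l,m+1)) \mid 0 \le l < N,\ 0 \le m < N\},
\end{array}
\]

where the addition is modulo $N$. It is straightforward to check that $\mathfrak{A} \models
\Theta_{1,x} \cup \Theta_{1,y}$, where $\Theta_{1,x}$ is the set of $\mathcal{N}^2$-formulas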



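\[
\begin{array}{lll}
\forall x(q(x) \to \exists y(q(y) \wedge h(x,y))) & & (22) \\
\forall x(X^*_i(x) \to \neg\exists y(h(x,y) \wedge \bar{X}_i(y))) & (0 \le i < p(n)) & (23) \\
\forall x(X^*_i(x) \to \neg\exists y(h(x,y) \wedge X_j(y))) & (0 \le j < i \le p(n)) & (24) \\
\forall x(X^+_{i,j}(x) \to \neg\exists y(h(x,y) \wedge \bar{X}_j(y))) & (0 \le i < j < p(n)) & (25) \\
\forall x(X^-_{i,j}(x) \to \neg\exists y(h(x,y) \wedge X_j(y))) & (0 \le i < j < p(n)) & (26) \\
\forall x(Y_i(x) \to \neg\exists y(h(x,y) \wedge \bar{Y}_i(y))) & (0 \le i < p(n)) & (27) \\
\forall x(\bar{Y}_i(x) \to \neg\exists y(h(x,y) \wedge Y_i(y))) & (0 \le i < p(n)), & (28)
\end{array}
\]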



and $\Theta_{1,y}$ is defined analogously, but with "X" and "Y" interchanged, and "h"
replaced by "v". Let $\Theta_1 = \Theta_{1,x} \cup \Theta_{1,y}$. Formula (22) ensures that every element
$a$ satisfying $q$ is related via $h$ to some other such element $b$. In the presence
of $\Theta_0$ and $\Gamma$, (23)-(26) then ensure that the 'x-coordinate' of $b$ is one greater
(modulo $N$) than the 'x-coordinate' of $a$; and (27)-(28) likewise ensure that $a$
and $b$ have the same 'y-coordinate'. Similar remarks apply, mutatis mutandis,
to the formulas $\Theta_{1,y}$. Since $\Theta_0$ ensures that at most $N^2$ elements satisfy $q$, it
follows that, in any model of $\Theta_0 \cup \Gamma \cup \Theta_1$, the extension of $q$ contains exactly one
element with any given pair of (x, y)-coordinates in the range $\{0, \ldots, N-1\}$,
and moreover that this collection of elements is organized by the interpretations
of $h$ and $v$ into an $N \times N$ toroidal grid in the expected way.
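
The claim that (23)-(26) force the x-coordinate to increase by one modulo N can be tested by brute force for small p(n). The sketch below encodes one reading of those formulas, which is an assumption of this example: if the least zero digit of the current coordinate is digit i, then the successor must have digit i equal to 1, all lower digits equal to 0, and all higher digits unchanged; if every digit is 1, all digits of the successor must be 0. The function names are illustrative.

```python
def digit(value, k):
    """Return the k-th binary digit of value (digit 0 is the least significant)."""
    return (value >> k) & 1


def star_index(l, p):
    """Return the unique i for which the predicate X*_i holds of x-coordinate l.

    This is the least i whose digit in l is 0, or p if all p digits are 1.
    """
    for i in range(p):
        if digit(l, i) == 0:
            return i
    return p


def allowed_successors(l, p):
    """List the x-coordinates that the assumed reading of (23)-(26) permits after l."""
    i = star_index(l, p)
    allowed = []
    for cand in range(2 ** p):
        flips_up = i == p or digit(cand, i) == 1                  # (23)
        clears_low = all(digit(cand, j) == 0 for j in range(i))   # (24)
        keeps_high = all(digit(cand, j) == digit(l, j)
                         for j in range(i + 1, p))                # (25), (26)
        if flips_up and clears_low and keeps_high:
            allowed.append(cand)
    return allowed
```

Under this reading the only permitted successor of l is (l + 1) mod 2^p, including the wrap-around from N - 1 to 0, which is the case handled by the predicate with index p(n).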

Having set up our grid, we proceed to colour it. Recall that the 'rubbish
dump', $A_3$, is a set containing $(M-1)N^2$ elements, where $M = |C|$. Let
$C = \{c_1, \ldots, c_M\}$. Assuming, provisionally, that $(C, H, V)$ has a tiling of size
$N$ with initial segment $i = i_0, \ldots, i_{n-1}$, choose some such tiling $t$. For all $k$
($1 \le k \le M$), let $n_k \le N^2$ be the number of grid-squares to which $t$ assigns
colour $c_k$, and let $B_k$ be a subset of $A_3$ with cardinality $N^2 - n_k$. From the
cardinality of $A_3$, and the fact that $\sum_k n_k = N^2$, we may choose the $B_k$ to be
pairwise disjoint; and, in that case, the $B_k$ will together exactly cover $A_3$. Now
treat the elements of $C$ as new unary predicates, and set

\[
c_k^{\mathfrak{A}} = \{a \in A_1 \mid t \text{ assigns the colour } c_k \text{ to } a\} \cup B_k,
\]

for all $k$ ($1 \le k \le M$). Let $o$ be a new unary predicate and set

\[
o^{\mathfrak{A}} = A_1 \cup A_3.
\]

It is simple to check that $\mathfrak{A} \models \Theta_2$, where $\Theta_2$ is the set of formulas

\[
\begin{array}{ll}
\forall x(q(x) \to o(x)) & \\
\exists_{\le MN^2} x\, o(x) & \\
\exists_{\ge N^2} x\, c_k(x) & (1 \le k \le M) \\
\forall x(c_k(x) \to o(x)) & (1 \le k \le M) \\
\forall x(c_k(x) \to \neg c_{k'}(x)) & (1 \le k < k' \le M).
\end{array}
\]

Conversely, in any model of $\Theta_2$, the interpretations of the predicates $c_k$ form a
pairwise disjoint cover of the interpretation of $o$, and, therefore, of the
interpretation of $q$.
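
The role of the rubbish dump is purely arithmetical: it pads every colour class up to exactly N² elements, so that the colours can be forced to cover the grid using only counting quantifiers. The following sketch, with illustrative names and a toy tiling that is not taken from the paper, computes the sizes of the padding sets for a given colouring.

```python
from collections import Counter
from itertools import product


def dump_block_sizes(tiling, colours, n_size):
    """Return, for each colour c_k, the size N^2 - n_k of its padding set B_k.

    tiling maps grid squares to colours; n_k is the number of squares
    to which the tiling assigns colour c_k.
    """
    used = Counter(tiling.values())
    return {c: n_size ** 2 - used[c] for c in colours}


def toy_colouring(n_size):
    """A checkerboard in red and green; the colour blue is never used."""
    return {(x, y): "red" if (x + y) % 2 == 0 else "green"
            for x, y in product(range(n_size), repeat=2)}
```

For a 4 x 4 grid and the three colours red, green and blue, the padding sets have sizes 8, 8 and 16. They sum to (M - 1)N² = 32, the size of the rubbish dump, and each colour class together with its padding has exactly N² = 16 elements.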

Turning to the input $i = i_0, \ldots, i_{n-1}$, let $o_0, \ldots, o_{n-1}$ be new unary
predicates. We interpret these so as to pick out the squares $(0, 0), \ldots, (n-1, 0)$ of
the grid, respectively. Formally:

\[
o_i^{\mathfrak{A}} = \{(i, 0)\}
\]

for all $i$ ($0 \le i < n$). It is a simple matter to write formulas specifying the
coordinates of these predicates. For example, define $\Theta_{3,0}$ to be the set of formulas

\[
\begin{array}{ll}
\exists x(o_0(x) \wedge q(x)) & \\
\forall x(o_0(x) \to \bar{X}_i(x)) & (0 \le i < p(n)) \\
\forall x(o_0(x) \to \bar{Y}_i(x)) & (0 \le i < p(n)) \\
\forall x(o_0(x) \to i_0(x)). &
\end{array}
\]

(Remember that $i_0$, being an element of $C$, is also a predicate interpreted by $\mathfrak{A}$.)
It is easy to see that $\mathfrak{A} \models \Theta_{3,0}$. Conversely, in any model of $\Gamma \cup \Theta_0 \cup \Theta_1 \cup \Theta_2 \cup
\Theta_{3,0}$, $o_0$ must be interpreted as the (unique) element in the extension of $q$ with
'coordinates' (0,0); moreover, that element must be assigned the 'colour' $i_0$.
Let the sets of formulas $\Theta_{3,1}, \ldots, \Theta_{3,n-1}$ be constructed analogously, fixing the
interpretations of $o_1, \ldots, o_{n-1}$, respectively, with colours assigned as specified
in $i$; and let $\Theta_3$ be $\Theta_{3,0} \cup \cdots \cup \Theta_{3,n-1}$.

Finally, let $\Theta_4$ be the set of formulas

\[
\begin{array}{ll}
\forall x(c_j(x) \to \neg\exists y(c_k(y) \wedge h(x,y))) & (1 \le j \le M,\ 1 \le k \le M,\ (j,k) \notin H) \\
\forall x(c_j(x) \to \neg\exists y(c_k(y) \wedge v(x,y))) & (1 \le j \le M,\ 1 \le k \le M,\ (j,k) \notin V).
\end{array}
\]

Since the interpretations of the $c_k$ were taken from a tiling $t$, we certainly have
$\mathfrak{A} \models \Theta_4$. Conversely, any model of $\Gamma \cup \Theta_0 \cup \cdots \cup \Theta_4$ defines a tiling for $(C, H, V)$
of size $N$ with initial segment $i$, in the obvious way.

The set of formulas $\Gamma \cup \Theta_0 \cup \cdots \cup \Theta_4$ is almost the required encoding $\Theta_i$:
the only problem is that the formulas $\Gamma$ are not in the fragment $\mathcal{N}^2$. Massaging
them into the appropriate form is the task of the second stage of the proof.

Second stage: Recall that, in the first stage, we gave each of the predicates $X_i$
and $Y_i$ ($0 \le i < p(n)$) a "barred" counterpart, $\bar{X}_i$ and $\bar{Y}_i$, with a complementary
interpretation in $\mathfrak{A}$ with respect to $q^{\mathfrak{A}} = A_1$. Moreover, we provided a set of
formulas $\Theta_0$, guaranteeing that such pairs have complementary interpretations
with respect to the extension of $q$. Let us do the same for the predicates $X^*_i$,
$X^+_{i,j}$, $X^-_{i,j}$, $Y^*_i$, $Y^+_{i,j}$, $Y^-_{i,j}$ (with indices in the ranges specified above), letting $\Theta_5$
be the requisite set of formulas. The construction of $\Theta_5$ is completely routine.

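Now enumerate the various predicates $X_i$, $X^*_i$, $X^+_{i,j}$, $X^-_{i,j}$, $Y_i$, $Y^*_i$, $Y^+_{i,j}$, $Y^-_{i,j}$,
in some order, as

\[
q_1, \ldots, q_s. \qquad (29)
\]

(There are indeed $s = 2(p(n)^2 + p(n) + 1)$ of these, if you tot them all up.) And
enumerate their barred counterparts, in the corresponding order, as

\[
\bar{q}_1, \ldots, \bar{q}_s. \qquad (30)
\]

Recall that the 'notebook', $A_2$, consists of the elements $(1, \top), \ldots, (s, \top)$ and
$(1, \bot), \ldots, (s, \bot)$. Referring to the enumerations (29) and (30), think of the
element $(h, \top)$ as standing for the atom $q_h(x)$, and of the element $(h, \bot)$ as
standing for the atom $\bar{q}_h(x)$, for all $h$ ($1 \le h \le s$). Let $l, l_1, \ldots, l_s$ and $\bar{l}_1, \ldots, \bar{l}_s$
be new unary predicates, interpreted in $\mathfrak{A}$ as follows:

\[
\begin{array}{rcll}
l^{\mathfrak{A}} &=& A_2 & \\
l_h^{\mathfrak{A}} &=& \{(h, \top)\} & (1 \le h \le s) \\
\bar{l}_h^{\mathfrak{A}} &=& \{(h, \bot)\} & (1 \le h \le s).
\end{array}
\]

It is simple to check that $\mathfrak{A} \models \Theta_6$, where $\Theta_6$ is the set of formulas

\[
\begin{array}{lll}
\exists_{\le 2s} x\, l(x) & & \\
\exists x\, l_h(x) & \exists x\, \bar{l}_h(x) & (1 \le h \le s) \\
\forall x(l_h(x) \to l(x)) & \forall x(\bar{l}_h(x) \to l(x)) & (1 \le h \le s) \\
\forall x(l_h(x) \to \neg l_{h'}(x)) & \forall x(\bar{l}_h(x) \to \neg \bar{l}_{h'}(x)) & (1 \le h < h' \le s) \\
\forall x(l_h(x) \to \neg \bar{l}_{h'}(x)) & & (1 \le h \le s,\ 1 \le h' \le s).
\end{array}
\]

Conversely, in any model of $\Theta_6$, the predicates $l_1, \ldots, l_s, \bar{l}_1, \ldots, \bar{l}_s$ are uniquely
instantiated, and pick out the $2s$ elements satisfying $l$.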

Fix any formula $\forall x(q(x) \to \gamma) \in \Gamma$. Note that the clause $\gamma$ is actually a
disjunction of atoms featuring only the predicates in (29) and (30). Let $r_\gamma$ be
a new binary predicate. Since $\mathfrak{A} \models \forall x(q(x) \to \gamma)$, define, for each $a \in A_1$, the
element $a_\gamma \in A_2$ as follows. Choose a literal (atom) $L$ of $\gamma$ satisfied by $a$: if $L$ is
$q_h(x)$ for some $h$ ($1 \le h \le s$), set $a_\gamma = (h, \top)$; if, on the other hand, $L$ is $\bar{q}_h(x)$
for some $h$ ($1 \le h \le s$), set $a_\gamma = (h, \bot)$. Think of the object $a_\gamma$ as representing
some literal of $\gamma$ satisfied by $a$. Having defined $a_\gamma$ for all $a \in A_1$, set

\[
r_\gamma^{\mathfrak{A}} = \{(a, a_\gamma) \mid a \in A_1\}.
\]

It is then easy to check that $\mathfrak{A} \models \Theta_\gamma$, where $\Theta_\gamma$ consists of the formula

\[
\forall x(q(x) \to \exists y(l(y) \wedge r_\gamma(x, y))), \qquad (31)
\]

together with the following formulas, for all $h$ ($1 \le h \le s$):

\[
\begin{array}{lll}
\forall x(q(x) \to \neg\exists y(l_h(y) \wedge r_\gamma(x,y))) & \text{(if $q_h(x)$ not a literal of $\gamma$)} & (32) \\
\forall x(q(x) \to \neg\exists y(\bar{l}_h(y) \wedge r_\gamma(x,y))) & \text{(if $\bar{q}_h(x)$ not a literal of $\gamma$)} & (33) \\
\forall x(\bar{q}_h(x) \to \neg\exists y(l_h(y) \wedge r_\gamma(x,y))) & & (34) \\
\forall x(q_h(x) \to \neg\exists y(\bar{l}_h(y) \wedge r_\gamma(x,y))). & & (35)
\end{array}
\]


Conversely, in any model of $\Theta_6 \cup \Theta_\gamma$, (31) guarantees that every object $a$ in
the extension of $q$ is related via $r_\gamma$ to some object $a_\gamma$ in the extension of $l$
(representing a literal); (32) and (33) then state that the literal represented by
$a_\gamma$ is a literal of the clause $\gamma$; and (34) and (35) state that $a$ satisfies this literal.
Together, $\Theta_0$, $\Theta_1$, $\Theta_5$, $\Theta_6$ and $\Theta_\gamma$ thus guarantee that any object satisfying $q$
also satisfies the clause $\gamma$; in other words:

\[
\Theta_0 \cup \Theta_1 \cup \Theta_5 \cup \Theta_6 \cup \Theta_\gamma \models \forall x(q(x) \to \gamma). \qquad (36)
\]

Let $\Theta_7 = \bigcup\{\Theta_\gamma \mid \forall x(q(x) \to \gamma) \in \Gamma\}$.

Let $\Theta_i = \Theta_0 \cup \cdots \cup \Theta_7$. The construction of $\Theta_i$ evidently proceeds in time
bounded by a polynomial function of the length $n$ of $i$. Every formula in $\Theta_i$
is an $\mathcal{N}^2$-formula, modulo trivial logical manipulations. And $(C, H, V)$ has a
tiling of size $N = 2^{p(n)}$ with initial segment $i$ if and only if $\Theta_i$ is satisfiable. □

We remark that the proof of Theorem 3 makes essential use of binary coding
of quantifier subscripts. For example, the subscript $MN^2$ has size $\lfloor 2p(n) +
\log M \rfloor + 1$, and hence is bounded by a polynomial function of $n$.

Corollary 2. The satisfiability problem and finite satisfiability problem for any
logic between $\mathcal{N}^2$ and $\mathcal{C}^2$ are both NEXPTIME-complete.

It follows that determining the validity of arguments in the numerically
definite relational syllogistic is a co-NEXPTIME-complete problem. Equipping
this fragment with relative clauses, for example,

At most 3 artists whom at least 4 beekeepers admire despise at least
5 dentists who envy at most 6 electricians,

evidently has no effect on the complexity of determining validity, since it does
not take us outside the fragment $\mathcal{C}^2$. Nor do proper nouns or negated
verb-phrases, for example

At most 3 artists do not despise (= fail to despise) at least one beekeeper
At least 3 artists despise Fred.

In fact, we may add a certain amount of anaphora to the fragment while still
remaining within $\mathcal{C}^2$, thus:

At most 3 artists who despise themselves admire at least 4 beekeepers
who envy them,

though care has to be taken in specifying the precise interpretation of pronouns
(see Pratt-Hartmann [9]). However, the complexity-theoretic consequences of
extending the repertoire of quantifiers in $\mathcal{C}^2$ (for example, to include such
constructions as "for most x") are unknown.

The following related facts are shown in Pratt-Hartmann and Third [12]: if
sentences involving transitive verbs are added to the ordinary syllogistic (with- 
out numerical quantifiers), the satisfiability problem for the resulting fragment 
remains in PTIME; if sentences involving both transitive verbs and relative 
clauses are added to the ordinary syllogistic, the satisfiability problem for the 
resulting fragment is EXPTIME-complete. 

Although adding relative clauses to $\mathcal{N}^2$ does not increase the complexity of
satisfiability, it nevertheless has other repercussions of a logical nature. For one
thing, we lose the finite model property: the sentences

At least 1 p rs at most 0 ps

At most 0 ps are ps which at most 0 ps r

At most 0 ps r at least 2 ps,

which, in essence, reproduce the content of formula (21), are satisfiable, but
not finitely so. Hence the satisfiability and finite satisfiability problems, though
both NEXPTIME-complete, are distinct. Interestingly, the addition of relative
clauses also affects the question of strong NEXPTIME-completeness. Inspection
of the proof of Theorem 3 shows that binary coding of quantifier subscripts is
required only to overcome the lack of Boolean connectives in $\mathcal{N}^2$ (specifically, in
simulating the effect of the formulas $\Gamma$ with the formulas of $\Theta_5$-$\Theta_7$, or stating
that the colours must exhaust the grid). Adding relative clauses obviates the
need for these contortions; and it is in fact easily checked that the satisfiability
problem and the finite satisfiability problem for this fragment are strongly
NEXPTIME-complete. This difference is noteworthy, because some other
fragments with counting quantifiers discussed in the literature have satisfiability and
finite satisfiability problems whose complexity is insensitive to whether
quantifier subscripts are coded in unary or binary (Pratt-Hartmann [10, 11]).
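
The difference between the two codings can be made concrete for the subscript MN² used in the proof of Theorem 3. The following sketch, with illustrative function names, compares the length of that subscript in binary with the stated bound of ⌊2p(n) + log M⌋ + 1; the bound is computed with integer arithmetic, using the fact that ⌊log₂ M⌋ is one less than the bit length of M.

```python
def binary_length(value):
    """Number of binary digits needed to write the positive integer value."""
    return value.bit_length()


def subscript_length_bound(p_n, num_colours):
    """Evaluate floor(2 p(n) + log2 M) + 1 without floating-point arithmetic.

    Since 2 p(n) is an integer, the floor equals 2 p(n) + floor(log2 M),
    and floor(log2 M) is M.bit_length() - 1 for M >= 1.
    """
    return 2 * p_n + (num_colours.bit_length() - 1) + 1


def subscript(p_n, num_colours):
    """The quantifier subscript M * N^2 with N = 2 ** p(n)."""
    return num_colours * (2 ** p_n) ** 2
```

For p(n) = 10 and M = 5 the subscript is 5 242 880, which takes 23 binary digits; a unary coding would need a string whose length is the subscript itself, and so would be exponential in p(n).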

5 Numerically definite syllogisms 

Various proof-systems have been proposed in the literature for determining
entailments in the numerically definite syllogistic, based on numerical
generalizations of the traditional syllogisms. Good examples are the natural deduction
systems of Murphree [I], for the language $\mathcal{N}^{1+}$, and of Hacker and Parry [5] for
the language $\mathcal{N}^1$. In this section, we use the results of the foregoing analysis to
explore the possibility of developing a system of numerically definite syllogisms
which is complete, in the sense that all valid sequents become derivable.

We start by adapting some familiar Aristotelian syllogisms in the obvious
way. In the sequel, $L$, $L_1$, $L_2$ and $L_3$ range over non-ground literals of $\mathcal{C}^1$, i.e.
formulas of the forms $p(x)$ or $\neg p(x)$. Thus, the formulas of $\mathcal{N}^{1+}$ simply have
the forms

\[
\exists_{\ge C} x(L_1 \wedge L_2) \qquad \exists_{\le C} x(L_1 \wedge L_2).
\]

For convenience, we regard $\exists_{\ge C} x(L_1 \wedge L_2)$ and $\exists_{\ge C} x(L_2 \wedge L_1)$ as identical, and
similarly for their $\exists_{\le C}$-quantified counterparts. In addition, we allow negative
numbers to appear in quantifier subscripts, again with the obvious semantics:
for $C < 0$, $\exists_{\le C} x(L_1 \wedge L_2)$ is trivially false, and $\exists_{\ge C} x(L_1 \wedge L_2)$ trivially true. If
$L$ is a literal, let $\bar{L}$ denote its opposite, that is, the literal formed by removing
any double negation from $\neg L$. Under these conventions, define $\mathcal{M}$ to be the
natural deduction system with (i) axiom schemas

\[
\exists_{\ge 0} x(L_1 \wedge L_2) \qquad \exists_{\le C} x(L \wedge \bar{L}),
\]

for all $C \ge 0$, (ii) rules of inference

\[
\frac{\exists_{\le C} x(L_1 \wedge L_2) \qquad \exists_{\le D} x(\bar{L}_2 \wedge L_3)}{\exists_{\le (C+D)} x(L_1 \wedge L_3)}
\]
\[
\frac{\exists_{\ge C} x(L_1 \wedge L_2) \qquad \exists_{\le D} x(L_2 \wedge \bar{L}_3)}{\exists_{\ge (C-D)} x(L_1 \wedge L_3)}
\]
\[
\frac{\exists_{\le C} x(L_1 \wedge L_1) \qquad \exists_{\ge D} x(L_1 \wedge L_2)}{\exists_{\le (C-D)} x(L_1 \wedge \bar{L}_2)}
\]

and (iii) the rule of ex falso quodlibet, allowing the derivation of any formula
whatsoever from contradictory premises:

if, from premises $\Phi$, we have deduced the formulas $\exists_{\le C} x(L_1 \wedge L_2)$
and $\exists_{\ge D} x(L_1 \wedge L_2)$, where $D > C$, then we may deduce $\phi$ from $\Phi$.

We write $\Phi \vdash_{\mathcal{M}} \phi$ if there is a deduction from premises $\Phi$ to conclusion $\phi$ in
$\mathcal{M}$.
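
The three rules of inference can be checked for soundness by brute force over small finite models. The sketch below makes an assumption about how the rules are to be read, since the barred literals are easy to misplace: the first rule takes premises bounding L1 ∧ L2 and the opposite of L2 together with L3 from above; the second takes a lower bound on L1 ∧ L2 and an upper bound on L2 together with the opposite of L3; the third concludes an upper bound on L1 together with the opposite of L2. A model is represented by the number of elements in each of the eight regions determined by three predicates. All names are illustrative.

```python
from itertools import product

PREDICATES = range(3)
LITERALS = [(p, sign) for p in PREDICATES for sign in (True, False)]
REGIONS = list(product((True, False), repeat=3))  # truth values of p0, p1, p2


def opposite(lit):
    """Return the opposite of a literal (predicate index, sign)."""
    return (lit[0], not lit[1])


def count(model, lit_a, lit_b):
    """Number of elements satisfying both literals; model maps region -> size."""
    return sum(n for region, n in model.items()
               if region[lit_a[0]] == lit_a[1] and region[lit_b[0]] == lit_b[1])


def rules_sound(max_count=1):
    """Check the three rules, with tightest premises, on all small models."""
    for sizes in product(range(max_count + 1), repeat=len(REGIONS)):
        model = dict(zip(REGIONS, sizes))
        for l1, l2, l3 in product(LITERALS, repeat=3):
            c12 = count(model, l1, l2)
            # Rule 1: |L1 & L2| <= C, |~L2 & L3| <= D  gives  |L1 & L3| <= C + D
            if count(model, l1, l3) > c12 + count(model, opposite(l2), l3):
                return False
            # Rule 2: |L1 & L2| >= C, |L2 & ~L3| <= D  gives  |L1 & L3| >= C - D
            if count(model, l1, l3) < c12 - count(model, l2, opposite(l3)):
                return False
            # Rule 3: |L1| <= C, |L1 & L2| >= D  gives  |L1 & ~L2| <= C - D
            if count(model, l1, opposite(l2)) > count(model, l1, l1) - c12:
                return False
    return True
```

Checking each rule with the tightest possible premise bounds is enough, because weakening a premise only weakens the conclusion by the same amount. The check passes on all models with at most one element per region; it is a sanity check on the reading of the rules, not a proof of soundness.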

The system $\mathcal{M}$ is at least as powerful as that of Murphree, once notational
differences are taken into account. And Murphree's system is in turn at least
as powerful as that of Hacker and Parry. Nevertheless, $\mathcal{M}$ is easily seen not to
be complete for the language $\mathcal{N}^{1+}$. For example, the valid sequent

\[
\exists_{\ge C} x(p(x) \wedge q(x)),\ \exists_{\ge D} x(p(x) \wedge \neg q(x)) \models \exists_{\ge (C+D)} x(p(x) \wedge p(x)) \qquad (37)
\]

is not derivable in $\mathcal{M}$. (Possibly, these writers never intended their systems
to handle conclusions of this form.) The question therefore arises as to the
prospects for producing a complete system of syllogisms for the numerically
definite syllogistic.

To make the ensuing analysis more robust (and the comparison with
published systems fairer), we consider the special case of the validity problem in
which it is known how many objects satisfy each predicate in question, and how
many objects fail to do so. Formally, we consider only inference problems from
numerically explicit premise sets, in the following sense.

Definition 2. Let $\Phi$ be a set of $\mathcal{N}^{1+}$-formulas, and let $p_1, \ldots, p_n$ be the
predicates appearing in $\Phi$. We say that $\Phi$ is numerically explicit if there exist natural
numbers $C, C_1, \ldots, C_n$ with $C > 0$ such that, for all $i$ ($1 \le i \le n$), (i) $C_i \le C$,
and (ii) $\Phi$ contains the formulas

\[
\begin{array}{ll}
\exists_{\le C_i} x(p_i(x) \wedge p_i(x)) & \exists_{\le (C - C_i)} x(\neg p_i(x) \wedge \neg p_i(x)) \\
\exists_{\ge C_i} x(p_i(x) \wedge p_i(x)) & \exists_{\ge (C - C_i)} x(\neg p_i(x) \wedge \neg p_i(x)).
\end{array}
\]

Regarding the sequent (37), it is easy to show that, if $\Phi$ is any numerically
explicit premise set containing $\exists_{\ge C} x(p(x) \wedge q(x))$ and $\exists_{\ge D} x(p(x) \wedge \neg q(x))$, then
$\Phi \vdash_{\mathcal{M}} \exists_{\ge (C+D)} x(p(x) \wedge p(x))$. We remark in passing that de Morgan's
numerically definite syllogisms also make reference to the assumed cardinalities of some
of the terms they involve (de Morgan [T], p. 161).
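
One way to see the derivability claim for sequent (37) is sketched below. It assumes that the second rule of $\mathcal{M}$ has premises $\exists_{\ge C} x(L_1 \wedge L_2)$ and $\exists_{\le D} x(L_2 \wedge \bar{L}_3)$, and that the third rule has conclusion $\exists_{\le (C-D)} x(L_1 \wedge \bar{L}_2)$. The symbol $C_p$, for the number of objects satisfying $p$ according to $\Phi$, is a label introduced here for illustration and is not the source's notation.

```latex
% Assumption: \Phi is numerically explicit, so it contains
%   \exists_{\le C_p} x(p(x) \wedge p(x)) and \exists_{\ge C_p} x(p(x) \wedge p(x)).
\begin{align*}
&1.\ \exists_{\le C_p} x(p(x) \wedge p(x))          && \text{in } \Phi \\
&2.\ \exists_{\ge C} x(p(x) \wedge q(x))            && \text{in } \Phi \\
&3.\ \exists_{\le (C_p - C)} x(p(x) \wedge \neg q(x)) && \text{third rule, from 1 and 2} \\
&4.\ \exists_{\ge D} x(p(x) \wedge \neg q(x))       && \text{in } \Phi
\end{align*}
% Case D > C_p - C: lines 3 and 4 are contradictory, so ex falso quodlibet
% yields \exists_{\ge (C+D)} x(p(x) \wedge p(x)).
% Case D \le C_p - C: put E = C_p - (C + D) \ge 0. If E = 0, the goal is
% already a member of \Phi. Otherwise:
\begin{align*}
&5.\ \exists_{\ge C_p} x(p(x) \wedge p(x))          && \text{in } \Phi \\
&6.\ \exists_{\le E} x(p(x) \wedge \neg p(x))       && \text{axiom} \\
&7.\ \exists_{\ge (C_p - E)} x(p(x) \wedge p(x))    && \text{second rule, from 5 and 6}
\end{align*}
% Since C_p - E = C + D, line 7 is the required conclusion.
```

The derivation uses the cardinality information in $\Phi$ at lines 1 and 5, which is exactly what an arbitrary premise set lacks.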

Unfortunately, the prospects for a complete system of numerical syllogisms,
even for the special case of numerically explicit premise sets, are not bright. For,
in the sequel, we exhibit a numerically explicit set of $\mathcal{N}^{1+}$-formulas $\Phi$ and an
$\mathcal{N}^{1+}$-formula $\phi$ such that $\Phi \models \phi$ but $\Phi \nvdash_{\mathcal{M}} \phi$. Moreover, this incompleteness
result will be seen to be relatively robust under a range of conceivable extensions
of $\mathcal{M}$.

For readability, we shall henceforth contract $\mathcal{N}^{1+}$-formulas with repeated
literals: thus, $\exists_{\le C} x(p(x) \wedge p(x))$ becomes $\exists_{\le C} x\, p(x)$, etc. We use the quantifier
$\exists_{=C}$ to abbreviate the obvious pair of formulas involving $\exists_{\le C}$ and $\exists_{\ge C}$. And we
write $\mathcal{N}^1$-formulas of the form $\exists_{\le 0} x(p(x) \wedge \pm q(x))$ in their more familiar guise:
$\forall x(p(x) \to \mp q(x))$. Fix $m > 6$. Let $A$ and $c$ be as in Lemma 3 and let $\Phi_1$ be
the set of $\mathcal{N}^1$-formulas consisting of

\[
\begin{array}{lll}
\exists_{\le 3(m+1)} x\, t(x) & & (38) \\
\exists_{\ge 3} x\, t_j(x) & (1 \le j \le m+1) & (39) \\
\forall x(t_j(x) \to t(x)) & (1 \le j \le m+1) & (40) \\
\forall x(t_j(x) \to \neg t_{j'}(x)) & (1 \le j < j' \le m+1) & (41) \\
\forall x(s_i(x) \to t(x)) & (1 \le i \le m) & (42) \\
\forall x(t_j(x) \to s_i(x)) & (1 \le i \le m,\ 1 \le j \le m+1,\ A_{i,j} = 1) & (43) \\
\forall x(t_j(x) \to \neg s_i(x)) & (1 \le i \le m,\ 1 \le j \le m+1,\ A_{i,j} = 0) & (44) \\
\exists_{=3} x(s_i(x) \wedge r(x)) & (1 \le i \le m-1) & (45) \\
\exists_{=4} x(s_m(x) \wedge r(x)). & & (46)
\end{array}
\]

Note that the list of quantifier subscripts in the $m$ formulas of (45) and (46)
matches the vector $c$.

Claim 1. For all $j$ ($1 \le j \le m+1$), $\Phi_1 \models \exists_{\ge 1} x(t_j(x) \wedge r(x))$.

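Proof. Suppose $\mathfrak{A} \models \Phi_1$. From (38)-(41), $t^{\mathfrak{A}}$ is partitioned into the pairwise
disjoint sets $t_1^{\mathfrak{A}}, \ldots, t_{m+1}^{\mathfrak{A}}$. And so, from (42)-(44), we have, for all $i$ ($1 \le i \le m$),

\[
s_i^{\mathfrak{A}} = \bigcup\{t_j^{\mathfrak{A}} \mid A_{i,j} = 1\},
\]

and hence

\[
|s_i^{\mathfrak{A}} \cap r^{\mathfrak{A}}| = \sum\{|t_j^{\mathfrak{A}} \cap r^{\mathfrak{A}}| : A_{i,j} = 1\}.
\]

Figure 1: Models of $\Phi$: (a) standard semantics; (b) probabilistic semantics. In
(b), one of the sets $R_j = R' \cap T_j$ is empty.

Therefore, (45) implies, for all $i$ ($1 \le i \le m-1$),

\[
\sum\{|t_j^{\mathfrak{A}} \cap r^{\mathfrak{A}}| : A_{i,j} = 1\} = 3,
\]

while (46) implies

\[
\sum\{|t_j^{\mathfrak{A}} \cap r^{\mathfrak{A}}| : A_{m,j} = 1\} = 4.
\]

In other words, $(|t_1^{\mathfrak{A}} \cap r^{\mathfrak{A}}|, \ldots, |t_{m+1}^{\mathfrak{A}} \cap r^{\mathfrak{A}}|)^T$ is a solution of $Ax = c$. Applying
Lemma 3, $|t_j^{\mathfrak{A}} \cap r^{\mathfrak{A}}| = 1$ for all $j$ ($1 \le j \le m+1$), which proves the claim. □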

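Now let $\Phi_2$ be the set of $\mathcal{N}^{1+}$-formulas

\[
\begin{array}{lll}
\exists_{=3(m+1)} x\, t(x) & \exists_{=3(m+1)} x\, \neg t(x) & \\
\exists_{=3} x\, t_j(x) & \exists_{=6m+3} x\, \neg t_j(x) & (1 \le j \le m+1) \\
\exists_{=9} x\, s_i(x) & \exists_{=6m-3} x\, \neg s_i(x) & (1 \le i \le m-1) \\
\exists_{=12} x\, s_m(x) & \exists_{=6m-6} x\, \neg s_m(x) & \\
\exists_{=3(m+1)} x\, r(x) & \exists_{=3(m+1)} x\, \neg r(x), &
\end{array}
\]

and let $\Phi = \Phi_1 \cup \Phi_2$. Thus, $\Phi$ is numerically explicit. Moreover, $\Phi$ is satisfiable:
Fig. 1a depicts a (in fact, the) model $\mathfrak{A}$ of $\Phi$. The domain $A$ has cardinality
$6(m+1)$, equally split between $t^{\mathfrak{A}}$ and its complement; the sets $t_j^{\mathfrak{A}}$ ($1 \le j \le
m+1$) partition $t^{\mathfrak{A}}$ into 3-element sets; the set $r^{\mathfrak{A}}$ has cardinality $3(m+1)$; and
the sets $t_j^{\mathfrak{A}} \cap r^{\mathfrak{A}}$ are all singletons. The extensions of the $s_i$ (not indicated in
Fig. 1a, for clarity) are all unions of various $t_j^{\mathfrak{A}}$, as specified by the matrix $A$.
Nevertheless, the validities reported in Claim 1 cannot all be reproduced by the
proof-system $\mathcal{M}$.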



Claim 2. There exists a $j$ ($1 \le j \le m+1$) such that $\Phi \nvdash_{\mathcal{M}} \exists_{\ge 1} x(t_j(x) \wedge r(x))$.






Proof. Fix $N = 6(m+1)$. We give a completely new semantics for the language
$\mathcal{N}^{1+}$ as follows. Let $\Sigma$ be the (assumed countable) set of all unary predicates
available to $\mathcal{N}^{1+}$. Let us re-badge the elements $p_1, p_2, \ldots$ of $\Sigma$ as proposition
letters; and, as before, we let $S$ denote the propositional language over this
signature. If $P$ is a probability assignment for $S$, we interpret $\mathcal{N}^{1+}$ with respect
to $P$ by writing

\[
P \mathrel{|\!\approx} \exists_{\ge C} x(p_1(x) \wedge p_2(x)) \text{ if and only if } P(p_1 \wedge p_2) \ge C/N,
\]

and similarly for all the other forms of $\mathcal{N}^{1+}$. (Remember that $N$ is a
constant here.) It is readily verified that the proof-system $\mathcal{M}$ is sound for the
$\mathrel{|\!\approx}$-semantics: all instances of the axiom schemas are true; all instances of the three
inference rules are truth-preserving; and ex falso quodlibet is validity-preserving.
We proceed to construct a probability assignment $P$ such that: (i) $P \mathrel{|\!\approx} \Phi$, and
(ii) for some $j$ ($1 \le j \le m+1$), $P \mathrel{|\!\not\approx} \exists_{\ge 1} x(t_j(x) \wedge r(x))$. It follows that, for this
$j$, $\Phi \nvdash_{\mathcal{M}} \exists_{\ge 1} x(t_j(x) \wedge r(x))$.

By Lemma 2 the equations $Ax = c$ have a solution $u_1, \ldots, u_{m+1}$ over $\mathbb{Q}^+$
with at least one zero value. On the other hand, it is obvious from examination
of $A$ and $c$ that $u_j$ must be less than or equal to 3 for all $j$ ($1 \le j \le m+1$).
Let $u$ be the least common multiple of all the (non-zero) denominators in the
$u_j$; let $W$ be a set of $uN = 6u(m+1)$ objects (henceforth: "worlds"); and let $T$
be a subset of $W$ of cardinality $3u(m+1)$. Now let $T$ be partitioned into cells
$T_1, \ldots, T_{m+1}$, each of which contains $3u$ worlds. For each $j$ ($1 \le j \le m+1$),
let $R_j$ be a subset of $T_j$ of cardinality $u u_j$ (which must be a natural number
no greater than $3u$), and let $R' = \bigcup\{R_j \mid 1 \le j \le m+1\}$. Since $R' \subseteq T$,
$|T| = 3u(m+1)$, and $|W| = 6u(m+1)$, we may choose a set $R'' \subseteq W \setminus T$
such that the set $R = R' \cup R''$ has cardinality $3u(m+1)$. Finally, for each $i$
($1 \le i \le m$), let $S_i = \bigcup\{T_j \mid A_{i,j} = 1\}$. Thus, $S_i$ has cardinality $9u$ for all $i$
($1 \le i \le m-1$), and $S_m$ has cardinality $12u$. This arrangement is depicted,
schematically, in Fig. 1b, except that the $S_i$ are not indicated, for clarity. Note,
however, that, because $u_1, \ldots, u_{m+1}$ is a solution of $Ax = c$, $|S_i \cap R| = 3u$, for
all $i$ ($1 \le i \le m-1$), and $|S_m \cap R| = 4u$.

We associate with each world $w \in W$ a truth-value assignment $\theta_w$ for the
propositional language $S$ by setting $\theta_w(s_i) = \top$ if and only if $w \in S_i$, $\theta_w(t_j) = \top$
if and only if $w \in T_j$, $\theta_w(t) = \top$ if and only if $w \in T$, $\theta_w(r) = \top$ if and only if
$w \in R$, and $\theta_w(p) = \bot$ for all other proposition-letters $p$. Then we define the
probability assignment $P$ by taking a flat distribution on $W$. That is, for every
$\phi \in S$, set

\[
P(\phi) = |\{w \in W : \theta_w \models \phi\}| / (uN).
\]

It is simple to check that $P \mathrel{|\!\approx} \Phi$. On the other hand, at least one of the $u_j$ is
zero; and for this value of $j$, $P(t_j \wedge r) = 0$, whence $P \mathrel{|\!\not\approx} \exists_{\ge 1} x(t_j(x) \wedge r(x))$. □
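
The probabilistic semantics used in this proof is easy to experiment with. The sketch below is a toy illustration and does not use the matrix A or the premise set of the paper: it represents each world as a dictionary of the proposition letters true in it (all other letters being false), takes the flat distribution, and evaluates formulas by comparing a probability with C/N. The function names and the four-world example are assumptions of this sketch.

```python
from fractions import Fraction


def probability(worlds, lit_a, lit_b):
    """Flat-distribution probability that both literals hold.

    A literal is a pair (letter, sign); letters absent from a world are false.
    """
    hits = sum(1 for w in worlds
               if w.get(lit_a[0], False) == lit_a[1]
               and w.get(lit_b[0], False) == lit_b[1])
    return Fraction(hits, len(worlds))


def holds_at_least(worlds, n_const, bound, lit_a, lit_b):
    """Probabilistic reading of 'at least bound objects satisfy both literals'."""
    return probability(worlds, lit_a, lit_b) >= Fraction(bound, n_const)


def holds_at_most(worlds, n_const, bound, lit_a, lit_b):
    """Probabilistic reading of 'at most bound objects satisfy both literals'."""
    return probability(worlds, lit_a, lit_b) <= Fraction(bound, n_const)


# Toy example with N = 2 and four worlds (two worlds per nominal object).
TOY_WORLDS = [{"t": True, "r": True}, {"t": True}, {"r": True}, {}]
```

In the toy example, with N = 2, the letter t has probability 1/2, so "exactly 1 object is t" holds; yet both t ∧ r and t ∧ ¬r have probability 1/4, so neither "at least 1 t is r" nor "at least 1 t is not r" holds. The single nominal object satisfying t is, in effect, split between the two cases, which cannot happen in an ordinary structure.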

Hence we have: 

Theorem 4. The proof-system $\mathcal{M}$ is not complete, even for numerically explicit
sets of premises.

Theorem 4 is robust with respect to any strengthening of $\mathcal{M}$ that is sound
under the probabilistic interpretation in the proof of Claim 2. We mention that, in
another paper, Murphree presents a language similar to our $\mathcal{N}^2$ (Murphree [5]);
however, no systematic proof theory is developed. In fact, we are not aware of
any published system of numerically definite syllogisms which has been shown
to be complete.